Chapter 1

Introduction 


Continuations control program flow using purely functional means. Informally, a continuation
is a function representing the rest of the program: when passed an intermediate
result  (a  value  in  a  functional  language,  a  store  in  an  imperative  language),  the  function 
“continues”  the  computation  to  the  final  result.  In  LISP  programs,  for  example,  the  control 
stack  can  be  thought  of  as  representing  the  continuation  of  a  program:  the  stack  tells  the 
interpreter  how  to  continue  the  computation  to  the  final  answer.  At  a  lower  level,  a  program 
counter  also  represents  a  continuation,  although  the  “function”  may  not  be  very  clear. 
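To make the informal description concrete, the following minimal Python sketch contrasts a direct-style function with a version that receives "the rest of the program" as an explicit function argument. The names `fact_direct`, `fact_cps`, and `identity` are illustrative assumptions and do not come from the thesis.

```python
def fact_direct(n):
    """Direct style: the pending multiplications live on the control stack."""
    if n == 0:
        return 1
    return n * fact_direct(n - 1)


def fact_cps(n, k):
    """Continuation-passing style: k represents the rest of the computation."""
    if n == 0:
        return k(1)
    # The new continuation multiplies by n, then continues with k.
    return fact_cps(n - 1, lambda result: k(n * result))


identity = lambda v: v  # the "do nothing more" continuation

print(fact_direct(5), fact_cps(5, identity))
```

Passing a different continuation, such as `lambda v: v + 1`, changes what happens to the intermediate result without touching the body of `fact_cps`.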

The  explicit  use  of  continuations  pervades  the  theory  and  practice  of  programming 
languages. Continuations first appeared in continuation-style semantics for imperative languages
[11, 30, 31]. In this style, continuations are explicitly passed to the meanings of all
program  statements.  The  meaning  of  imperative  statements  can  be  modeled  as  functions 
that  change  the  continuation.  For  example,  in  an  ALGOL-like  language  with  goto  <label> 
statements,  each  label  marks  a  particular  continuation.  The  meaning  of  the  statement 
goto <label> is one that, upon receiving a store and a continuation, discards that continuation
and passes the store to the continuation associated with <label>. Highly imperative
constructs  like  goto  are  difficult  or  impossible  to  represent  in  “direct”  semantics  in  which 
statements  are  modeled  as  functions  from  answers  to  answers  [11,  30]. 
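The treatment of goto can be sketched in a few lines of Python. This is only an illustration under assumptions: the statement forms (`assign`, `label`, `goto`, and a conditional jump `ifgoto`), the dictionary representation of stores, and the function `run` are all hypothetical and are not taken from the cited semantics. Each statement denotes a function of a store and a continuation; the `goto` case ignores the continuation it is handed.

```python
def run(program, store):
    """Run a list of statements on an initial store and return the final store."""
    labels = {stmt[1]: i for i, stmt in enumerate(program) if stmt[0] == "label"}

    def cont_at(i):
        """The continuation 'execute the program from statement i onwards'."""
        def k(s):
            if i >= len(program):
                return s  # the final answer is the store itself
            return meaning(program[i])(s, cont_at(i + 1))
        return k

    def meaning(stmt):
        kind = stmt[0]
        if kind == "assign":  # ("assign", variable, function of the store)
            _, var, expr = stmt
            return lambda s, k: k({**s, var: expr(s)})
        if kind == "label":   # a label does nothing by itself
            return lambda s, k: k(s)
        if kind == "goto":    # discard k, continue at the label's continuation
            return lambda s, k: cont_at(labels[stmt[1]])(s)
        if kind == "ifgoto":  # ("ifgoto", test on the store, label)
            _, test, label = stmt
            return lambda s, k: cont_at(labels[label])(s) if test(s) else k(s)
        raise ValueError(kind)

    return cont_at(0)(store)


# Sum the numbers n, n-1, ..., 1 with an explicit loop built from goto.
sum_program = [
    ("assign", "acc", lambda s: 0),
    ("label", "loop"),
    ("ifgoto", lambda s: s["n"] == 0, "done"),
    ("assign", "acc", lambda s: s["acc"] + s["n"]),
    ("assign", "n", lambda s: s["n"] - 1),
    ("goto", "loop"),
    ("label", "done"),
]
print(run(sum_program, {"n": 4}))
```

Because every statement receives its continuation explicitly, a jump needs no special machinery: it simply calls a different continuation from the one it was given.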

Continuations  appear  in  at  least  two  other  settings.  In  languages  such  as  LISP  and 
Scheme,  the  continuation  of  a  program  may  be  accessed  through  the  control  operator 
call-with-current-continuation  (call/cc)  [23].  The  programmer  may  then  use  the 
continuation to repeat certain calculations, perform error traps, backtrack through a computation,
or simulate forks and joins [10]. Continuations have also been used in compilers
for  languages  such  as  Scheme  and  ML.  These  compilers  apply  a  continuation-passing  style 
(cps)  transform  as  a  fundamental  step  in  compilation  [1,  9,  28]. 
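Python has no call/cc, but the "error trap" use mentioned above can be imitated once a program is written in continuation-passing style. In the sketch below, `call_cc`, `product_cps`, `product`, and the `multiplications` log are hypothetical names introduced for illustration; the code models the idea and is not Scheme's operator.

```python
def call_cc(f, k):
    """CPS rendering of call/cc: f receives the current continuation k,
    reified as a function that ignores the continuation at the jump site."""
    escape = lambda value, _dropped: k(value)
    return f(escape, k)


multiplications = []  # records the multiplications actually performed


def product_cps(xs, k, escape):
    if not xs:
        return k(1)
    if xs[0] == 0:
        # Error trap: jump out, abandoning all pending multiplications.
        return escape(0, k)

    def after(rest):
        multiplications.append(xs[0])
        return k(xs[0] * rest)

    return product_cps(xs[1:], after, escape)


def product(xs):
    return call_cc(lambda escape, k: product_cps(xs, k, escape), lambda v: v)


print(product([2, 3, 4]), product([2, 3, 0, 5]))
```

When a zero is found, the continuation built up so far is dropped and the captured one is called instead, so no multiplication is carried out for that list.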

Each  of  the  three  settings  involves  “programming”  with  continuations,  and  it  is  almost 
self-evident  that  this  requires  a  different  style  of  thinking.  What  is  not  obvious,  however, 
is  whether  working  in  a  continuation  setting  requires  new  reasoning  tools.  Indeed,  certain 
principles should remain valid in the context of continuations. For example, the substitution
of  actual  parameters  for  formal  parameters  in  procedure  calls  should  not  become  invalid — 
otherwise,  the  addition  of  continuations  would  change  the  programming  language  in  drastic 
ways! 

On the other hand, the mere addition of continuation-based control operators to languages
suggests that continuations change programming in a fundamental way. In the
presence  of  control  operators,  a  programmer  may  be  able  to  distinguish  pieces  of  code  that 
were  indistinguishable  without  control  operators,  making  the  language  more  powerful.  One 
can make similar arguments for the other two settings. For instance, programs not expressible
when programming directly in the language become expressible when using cps-converted code.

This  thesis  attempts  to  make  precise  the  intuition  that  continuations  “change  things” 
in  the  three  settings  of  continuations.  Using  specific  counterexamples,  we  shall  prove  that 
certain  familiar  reasoning  principles  are  unsound  in  the  three  settings  of  continuations.  In 
essence,  reasoning  about  code  in  the  usual  way  may  lead  one  to  draw  faulty  conclusions 
about  the  behavior  of  that  code.  By  understanding  the  failure  of  reasoning  principles  in 
each  of  the  three  settings  of  continuations,  we  move  closer  to  understanding  continuations 
themselves;  insights  generated  by  the  examples  will  help  in  building  a  suitable  theory  of 
continuations. 


1.1  Reasoning  about  Code 

By “reasoning principles” we mean principles for proving equivalences of code. Such principles
capture the notion of “behavior of code.” For example, a $\lambda$-abstraction applied to an
integer  argument  in  LISP  behaves  the  same  (ignoring  efficiency  issues)  as  the  body  of  the 
abstraction  with  the  integer  in  place  of  the  abstracted  variable.  These  two  pieces  of  code  are 
equivalent,  and  the  definition  of  a  LISP  interpreter  may  be  used  to  verify  this  equivalence. 

Two  pieces  of  code  are  “equivalent”  if  they  produce  the  same  “outcomes”  under  the 
interpreter.  To  make  this  more  precise,  we  must  define  the  observations,  the  net  outcomes 
of  the  interpreter  considered  important.  Typically,  we  choose  to  observe  terms  at  which  the 
interpreter  stops.  In  the  language  A„  defined  in  Chapter  2,  we  will  observe  evaluation  to 
numerals.1  Let  Eval(M)  be  a  partial  function  from  terms  to  terms,  representing  the  output 
of  the  interpreter  on  terms;  we  then  say 

Definition  1.1  (Informal)  Two  terms  M  and  N  are  observationally  equivalent  if 
Eval(M)  and  Eval(N)  agree  on  all  observations. 

Two  programs  are  observationally  equivalent  if  they  produce  the  same  observable  results. 

Observational equivalence states that two terms as given cannot be told apart by the
interpreter. For languages with functional terms, observational equivalence is too coarse;
one may still be able to distinguish two observationally equivalent terms. For instance, if we
choose to observe “termination of the interpreter” in LISP, any two $\lambda$-abstractions would
agree on all observations and hence would be considered observationally equivalent. Yet
a programmer may be able to distinguish two $\lambda$-abstractions by writing a context (a term
with  a  hole)  that  makes  the  terms  evaluate  to  different  observations.  One  may  formalize 
this  ability  to  distinguish  terms: 

Definition 1.2 (Informal) Two terms $M$ and $N$ are observationally distinguishable
iff for some context $C[\cdot]$, $C[M]$ and $C[N]$ differ on some observation (in other words, are
not observationally equivalent).

The  complementary  notion  is,  in  fact,  more  important: 

Definition 1.3 (Informal) Two terms $M$ and $N$ are observationally congruent (written
$M \equiv_{obs} N$) iff they are not observationally distinguishable.
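The role of contexts can be illustrated with a small Python experiment. Here the "terms" are Python functions, the contexts are a hand-picked finite list, and the only observation is the integer a complete program produces; all names are hypothetical. A finite list of contexts can demonstrate that two terms are distinguishable, but it can never establish congruence, which quantifies over all contexts.

```python
double = lambda n: n + n
square = lambda n: n * n
twice = lambda n: 2 * n

# Each context takes a term and yields a complete program's integer result.
contexts = [
    lambda f: f(0),
    lambda f: f(2),
    lambda f: f(3),
    lambda f: f(f(1)),
]


def distinguishing_contexts(m, n):
    """Indices of the contexts in which m and n yield different observations."""
    return [i for i, c in enumerate(contexts) if c(m) != c(n)]


print(distinguishing_contexts(double, square))
print(distinguishing_contexts(double, twice))
```

If only termination were observed, all three functions would look alike as given; placing them in the third or fourth context separates `double` from `square`.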

1 More complex observations may result in finer distinctions between terms; see [4, 17] for an example of
another reasonable notion of observation.

Observational congruence is the congruence closure of observational equivalence.

From  a  software  engineering  perspective,  observational  congruence  captures  the  notion 
of  “modularity”  of  code.  For  example,  two  routines  that  “sort”  should  be  observationally 
congruent:  the  “sort”  routines  should  be  interchangeable  in  any  program,  and  the  program 
should produce the same answers using either routine. Observational congruence also provides
one definition of a “correct” compiler optimization: if one piece of code is replaced by
a  faster  yet  observationally  congruent  piece,  the  optimization  is  “safe,”  i.e.,  the  optimized 
code  will  still  produce  the  expected  answer. 

When  we  say  “reasoning  about  code,”  we  mean  reasoning  used  to  prove  observational 
congruences.  In  fact,  almost  any  reasoning  principle  may  be  viewed  as  a  way  to  verify 
observational  congruences.  For  instance,  fixpoint  induction  in  denotational  semantics  and 
pure $\lambda$-calculus-like equational reasoning are reasoning tools for proving congruences. These
formal  reasoning  principles  help  justify  the  informal  observational  congruence  reasoning 
used  by  programmers,  clarifying  common  assumptions  about  the  behavior  of  code. 

1.2  Outline  of  Thesis 

We concentrate on the setting of cps conversion, since the cps transform seems fundamental
to understanding the other two settings of continuations: a continuation transform forms
the basis of many continuation semantics (cf. [24, 26, 30]) and is often used to describe
the semantics of call/cc-like operators (cf. [7, 8]). Chapter 2 describes a call-by-value
functional language $\lambda_v$ and its continuation transform, both of which are the focus of study.

In Chapter 3, we describe specific examples that show the failure of reasoning principles
based on observational congruence. These examples will have the form “$M$ and $N$ are
observationally congruent but not congruent in one of the continuation settings.” In particular,
we show that two terms may be observationally congruent but their cps-transforms
may not be. Similar observations are also made for the other two settings of continuations.

The  unsoundness  of  familiar  reasoning  principles  indicates  that  a  theory  of  continuations 
remains  to  be  found.  Chapter  4  discusses  possible  directions  for  such  a  theory.  One  method 
(currently  being  pursued)  involves  extending  the  retraction-based  method  of  Meyer  and 
Wand  [15].  One  might  also  seek  results  tying  the  three  settings  of  continuations  together. 
Finally, an Appendix contains proofs of “standard” theorems for $\lambda_v$.

Chapter 2

The  Language  and  its  Continuation  Transform 


This chapter defines $\lambda_v$, a call-by-value version of the language PCF [20, 25], including an
interpreter for $\lambda_v$. A call-by-value continuation transform for the language is then given,
along  with  theorems  that  show  the  correctness  of  the  transform. 


2.1  Syntax 

The familiar syntax of the simply-typed $\lambda$-calculus forms the basis of $\lambda_v$. Each term in $\lambda_v$
has a type of the form $o$ or $(\sigma \to \tau)$, where $o$ is the sole base type, the type of natural
numbers, and $\sigma \to \tau$ is the type of functions from $\sigma$ to $\tau$.1 The set of terms with their
corresponding types is defined in Figure 2.1. In this definition and throughout the text,
Greek letters (with the exception of $\kappa$, $\lambda$, and $\mu$) denote types, uppercase Roman letters
denote terms, the lowercase letters $f$, $g$, and $h$ are $\mu$-variables, and all other letters (e.g.,
$\kappa$, $a$, $b$, $c$) are $\lambda$-variables except when otherwise stated.

$x^\sigma : \sigma$                          $\lambda$-variables, where $x \in \mathcal{L}$
$f^\sigma : \sigma$                          $\mu$-variables, where $f \in \mathcal{M}$
$c_l : o$                                    numerals ($l \geq 0$)
$\mathrm{succ}, \mathrm{pred} : o \to o$     functional constants
$(\mathrm{cond}\ B\ M\ N) : o$               conditionals, where $B, M, N : o$
$(M\ N) : \tau$                              applications, where $M : \sigma \to \tau$ and $N : \sigma$
$(\lambda x^\sigma.M) : \sigma \to \tau$     $\lambda$-abstractions, where $M : \tau$
$(\mu f^\sigma.M) : \sigma$                  recursive definitions, where $M : \sigma$

Figure 2.1: The syntax for $\lambda_v$; here, $\mathcal{L}$ and $\mathcal{M}$ are two disjoint, infinite sets of variables. Each
variable in $\lambda_v$ is tagged with a type (cf. [20]), but types will often be dropped when the context is
clear.

The $\lambda$- and $\mu$-variables occurring in a term may be bound or free [2]. If two terms $M$
and $N$ differ only in the names of bound variables, we consider them to be syntactically
equivalent and write $M \equiv N$ [2]. A term is closed if it contains no free variables; otherwise,
a term is open.

Contexts are special terms containing holes. A context $C[\cdot]$ is derived from a term $M$
by replacing all free occurrences of some variable in $M$, say $f^\sigma$, by a hole $[\cdot]$. $C[N]$ is the
result of replacing every hole in $C[\cdot]$ with $N$, where $N : \sigma$ and the type of the hole is $\sigma$.
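Hole filling can be sketched with terms encoded as nested Python tuples. The encoding (`("lam", x, body)`, `("app", m, n)`, `("var", x)`, `("num", l)`), the `HOLE` marker, and the function `plug` are assumptions made for illustration. The sketch also assumes, as is usual for contexts, that filling a hole performs no renaming, so a free variable of the inserted term may be captured by a binder of the context.

```python
HOLE = ("hole",)


def plug(context, term):
    """Return C[N]: every hole in `context` is replaced by `term`, with no renaming."""
    if context == HOLE:
        return term
    return tuple(
        plug(part, term) if isinstance(part, tuple) else part
        for part in context
    )


# C[.] = lambda x. ([.] 3); filling the hole with the variable x captures it.
context = ("lam", "x", ("app", HOLE, ("num", 3)))
filled = plug(context, ("var", "x"))
print(filled)
```

This differs from substitution $M[x := N]$, which renames bound variables to avoid such capture.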

1 As is customary, parentheses will frequently be dropped from types with the understanding that $\to$
associates to the right. For example, $o \to o \to o$ is short for $(o \to (o \to o))$.


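$$
\begin{array}{l}
(\lambda x.M)\ V \to_v M[x := V], \quad V \text{ a value} \\
\mu f.M \to_v M[f := \mu f.M] \\
\mathrm{succ}\ c_l \to_v c_{l+1} \\
\mathrm{pred}\ c_0 \to_v c_0 \\
\mathrm{pred}\ c_{l+1} \to_v c_l \\
\mathrm{cond}\ c_0\ M_0\ M_1 \to_v M_0 \\
\mathrm{cond}\ c_{l+1}\ M_0\ M_1 \to_v M_1
\end{array}
$$

$$
\dfrac{B \to_v B'}{\mathrm{cond}\ B\ M_0\ M_1 \to_v \mathrm{cond}\ B'\ M_0\ M_1}
\qquad
\dfrac{M \to_v M', \quad c \in \{\mathrm{succ}, \mathrm{pred}\}}{c\ M \to_v c\ M'}
$$

$$
\dfrac{M \to_v M'}{M\ N \to_v M'\ N}
\qquad
\dfrac{N \to_v N'}{(\lambda x.M)\ N \to_v (\lambda x.M)\ N'}
$$

Figure 2.2: Structured rewrite rules for $\lambda_v$. Substitution of the term $N$ for the variable $x$ in $M$,
with the necessary renaming of bound variables, is written $M[x := N]$ (see [2] for a formal definition).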

2.2  Operational  Semantics 

The relation $\to_v$, the one-step reduction relation on terms of $\lambda_v$, is defined in Figure 2.2
using a structured operational semantics [19, 21]. In reducing applications, operands are
substituted in for $\lambda$-bound variables only when the operand is a value. A value (usually
denoted by $V$) is a $\lambda$-abstraction, a constant, or a $\lambda$-variable. None of these terms can be
rewritten using $\to_v$, so a value is a term in evaluated form.2

It is relatively easy to see, from the fact that values cannot be rewritten, that $\to_v$ is deterministic.
This allows us to define an interpreter for $\lambda_v$ from $\to_v$. Since $\lambda_v$ is a language for arithmetic,
we choose the final answers of the interpreter to be numerals. The input to an interpreter
for $\lambda_v$ should therefore be closed terms of base type, which we call complete programs.
(A complete program is a program coupled with a particular set of inputs.) The reflexive,
transitive closure of the relation $\to_v$, written $\twoheadrightarrow_v$, can be used to define a partial recursive function

$$Eval_v : \text{Complete programs} \rightharpoonup \text{Numerals}$$

$$Eval_v(M) = \begin{cases} c_l & \text{if } M \twoheadrightarrow_v c_l \\ \text{undefined} & \text{otherwise} \end{cases}$$

which is an interpreter for the language.
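The rewrite rules translate almost line for line into a small-step interpreter. The Python sketch below makes several assumptions that are not in the thesis: terms are nested tuples (`("var", x)` for $\lambda$-variables, `("fvar", f)` for $\mu$-variables, `("num", l)`, `("succ",)`, `("pred",)`, `("cond", B, M0, M1)`, `("app", M, N)`, `("lam", x, M)`, `("mu", f, M)`), substitution does no renaming (adequate when the substituted term is closed, as it is when evaluating a complete program), and "undefined" is approximated by returning `None` once a step budget `fuel` runs out.

```python
def is_value(t):
    return t[0] in ("lam", "num", "succ", "pred", "var")


def subst(t, name, s):
    """t[name := s], without renaming; safe when s is closed."""
    tag = t[0]
    if tag in ("var", "fvar"):
        return s if t[1] == name else t
    if tag in ("num", "succ", "pred"):
        return t
    if tag in ("lam", "mu"):
        return t if t[1] == name else (tag, t[1], subst(t[2], name, s))
    return (tag,) + tuple(subst(u, name, s) for u in t[1:])  # app, cond


def step(t):
    """One step of the reduction relation, or None if no rule applies."""
    tag = t[0]
    if tag == "mu":
        return subst(t[2], t[1], t)
    if tag == "cond":
        _, b, m0, m1 = t
        if b[0] == "num":
            return m0 if b[1] == 0 else m1
        b2 = step(b)
        return None if b2 is None else ("cond", b2, m0, m1)
    if tag == "app":
        _, m, n = t
        if m[0] in ("succ", "pred"):
            if n[0] == "num":
                return ("num", n[1] + 1 if m[0] == "succ" else max(n[1] - 1, 0))
            n2 = step(n)
            return None if n2 is None else ("app", m, n2)
        if m[0] == "lam":
            if is_value(n):
                return subst(m[2], m[1], n)
            n2 = step(n)
            return None if n2 is None else ("app", m, n2)
        m2 = step(m)
        return None if m2 is None else ("app", m2, n)
    return None


def eval_v(t, fuel=10_000):
    """Numeral reached by a complete program, or None if none is found in time."""
    for _ in range(fuel):
        if t[0] == "num":
            return t[1]
        t = step(t)
        if t is None:
            return None
    return None


# add = mu f. lambda x. lambda y. cond x y (succ (f (pred x) y))
ADD = ("mu", "f", ("lam", "x", ("lam", "y",
       ("cond", ("var", "x"), ("var", "y"),
        ("app", ("succ",),
         ("app", ("app", ("fvar", "f"), ("app", ("pred",), ("var", "x"))),
          ("var", "y")))))))
print(eval_v(("app", ("app", ADD, ("num", 2)), ("num", 3))))
```

Because `step` returns at most one reduct for any term, the sketch also reflects the determinism of the reduction relation.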

In  our  investigation  of  the  cps  transform  we  will  be  most  interested  in  reasoning  about 
the  behavior  of  code  under  Evalv.  We  say  that 

Definition 2.1 $M$ observationally approximates $N$, written $M \sqsubseteq_v N$, if, for any context
$C[\cdot]$ such that $C[M]$ and $C[N]$ are complete programs, $C[M] \twoheadrightarrow_v c_l$ implies $C[N] \twoheadrightarrow_v c_l$.

Two terms $M$ and $N$ are observationally congruent, written $M \equiv_{obs} N$, if $M \sqsubseteq_v N$ and
$N \sqsubseteq_v M$.

Observational congruences can be difficult to prove using only the definition [12]. For
example, consider the terms $N_1 \equiv \lambda x.(\lambda y.y)\ c_3$ and $N_2 \equiv \lambda x.c_3$. If $N_1$ is applied to an
argument during the evaluation of a program, the “active” subterm at the next stage will
be $(\lambda y.y)\ c_3$, which will reduce to $c_3$. If $N_2$ appeared as the subterm instead, $c_3$ will again
be the result. The terms should thus be congruent. This argument, however, is difficult to
formalize and is of little use in proving other observational congruences.

2 Using this rationale, $\mu$-variables might also be considered values, if it were not for the fact that
$\mu$-variables may be replaced by terms that require further evaluation. For example, $f$ gets replaced by a
non-value in the reduction $\mu f.f \to_v f[f := \mu f.f]$. In contrast, $\lambda$-variables remain values when reduced and
hence are considered values. This distinction explains the need for two disjoint sets of variables. Plotkin
also uses two sets of variables in one version of his metalanguage [22].

Equational reasoning based on $\to_v$ can be used to prove $N_1 \equiv_{obs} N_2$. Define the relation
$=_v$ by replacing all $\to_v$'s in the definition of $\to_v$ by $=_v$'s, adding the axioms of reflexivity,
symmetry, and transitivity, and condensing the operational rules with antecedents into the
congruence rule

$$\dfrac{M =_v M'}{C[M] =_v C[M']}$$

where $C[\cdot]$ is any context (not necessarily making $C[M]$ a complete program). The rules of
$=_v$ are sound for proving observational congruences.

Theorem 2.2 If $M =_v N$, then $M \equiv_{obs} N$.

Proof: Delayed to the Appendix. ■

$N_1 \equiv_{obs} N_2$ now follows from the fact that $N_1 =_v N_2$.

The converse to Theorem 2.2 is false: there are terms that are observationally congruent
but cannot be proven equivalent.3 The following theorem will be useful in verifying
congruences:

Theorem 2.3 Let $M$ and $N$ be closed terms of the same type. Then $M \sqsubseteq_v N$ iff, for all
vectors $\vec{V}$ of closed values, $M\ \vec{V} \twoheadrightarrow_v V_0$ implies $N\ \vec{V} \twoheadrightarrow_v V_0'$ and $V_0 \equiv V_0'$ if either is a
numeral.

Proof: Delayed to the Appendix. ■

Theorem 2.3 states that applicative contexts determine observational congruence (cf. [3]).

2.3  Continuation  Transform 

2.3.1  Definition 

The continuation transform for $\lambda_v$ is based on a cps transform appropriate for call-by-value
[9, 15, 19]. The transform of a term $M$, written $\overline{M}$, is another term of $\lambda_v$. Figure 2.3 defines
the transform of a term by structural induction on the term.

The behavior of the interpreter for $\lambda_v$ provides clues to understanding the continuized
version  of  a  term.  Basically,  the  flow  of  control  is  made  explicit  by  the  continuations  of 
a  cps-converted  term.  For  example,  since  values  are  not  evaluated,  the  cps  transform  of  a 
value  simply  passes  the  value  to  a  continuation  (the  rest  of  the  program.)  For  applications 
as  well,  the  continuations  in  the  transform  of  an  application  mimic  the  flow  of  control  in  the 
interpreter:  the  continuation  passed  to  the  operator  first  evaluates  the  operand  and  passes 
control  to  the  operand’s  continuation,  which,  in  turn,  applies  the  operator  to  the  operand. 

The  explicit  incorporation  of  continuations  requires  that  the  transform  change  the  type 
of  a  term.  A  continuized  term  accepts  a  continuation  as  an  argument  (a  function  from  some 
type  to  a  final  answer),  and  produces  a  final  answer  given  that  continuation.  The  type  of 
final answers for $\lambda_v$ is $o$, so a term of type $o$ is transformed into a term of type $(o \to o) \to o$.

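3 In fact, observational congruence is not axiomatizable [2, 32], so one cannot hope for an equational proof
system that captures observational congruence.

$$
\begin{array}{rcl}
\overline{x^\sigma} &=& \lambda\kappa.\kappa\ x^{\sigma'} \\
\overline{f^\sigma} &=& \lambda\kappa.f^{(\sigma' \to o) \to o}\ \kappa \\
\overline{c_l} &=& \lambda\kappa.\kappa\ c_l \\
\overline{\mathrm{succ}} &=& \lambda\kappa.\kappa\ (\lambda x^o.\lambda\kappa_1.\kappa_1\ (\mathrm{succ}\ x)) \\
\overline{\mathrm{pred}} &=& \lambda\kappa.\kappa\ (\lambda x^o.\lambda\kappa_1.\kappa_1\ (\mathrm{pred}\ x)) \\
\overline{\mathrm{cond}\ B\ M_0\ M_1} &=& \lambda\kappa.\overline{B}\ (\lambda m^o.\mathrm{cond}\ m\ (\overline{M_0}\ \kappa)\ (\overline{M_1}\ \kappa)) \\
\overline{(M\ N)} &=& \lambda\kappa.\overline{M}\ (\lambda m^{(\sigma \to \tau)'}.\overline{N}\ (\lambda n^{\sigma'}.m\ n\ \kappa)) \quad \text{(where } M : \sigma \to \tau \text{ and } N : \sigma\text{)} \\
\overline{\lambda x^\sigma.M} &=& \lambda\kappa.\kappa\ (\lambda x^{\sigma'}.\overline{M}) \\
\overline{\mu f^\sigma.M} &=& \lambda\kappa.(\mu f^{(\sigma' \to o) \to o}.\overline{M})\ \kappa
\end{array}
$$

Figure 2.3: The continuation transform for $\lambda_v$. The types of continuations $\kappa$ (which have the form
$\sigma' \to o$) have been omitted for clarity. Note that variables change types when transformed.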
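As a rough illustration, the transform can be written as a structural recursion over terms encoded as nested Python tuples. The encoding (`("var", x)`, `("fvar", f)` for $\mu$-variables, `("num", l)`, `("succ",)`, `("pred",)`, `("cond", B, M0, M1)`, `("app", M, N)`, `("lam", x, M)`, `("mu", f, M)`), the function name `cps`, and the scheme for generating fresh names with a leading underscore are assumptions of this sketch; source terms are assumed not to use such names. Types are ignored, and the clauses for variables and numerals follow the shapes used in the proofs of Section 2.3.2.

```python
import itertools


def cps(t, names=None):
    """Continuation transform of a tuple-encoded term (types omitted)."""
    names = names if names is not None else itertools.count()
    new = lambda base: f"_{base}{next(names)}"  # fresh variable name
    tag = t[0]
    k = new("k")
    kv = ("var", k)
    if tag in ("var", "num"):          # values are handed to the continuation
        return ("lam", k, ("app", kv, t))
    if tag == "fvar":                  # a mu-variable is applied to the continuation
        return ("lam", k, ("app", t, kv))
    if tag in ("succ", "pred"):
        x, k1 = new("x"), new("k")
        body = ("app", ("var", k1), ("app", t, ("var", x)))
        return ("lam", k, ("app", kv, ("lam", x, ("lam", k1, body))))
    if tag == "lam":
        return ("lam", k, ("app", kv, ("lam", t[1], cps(t[2], names))))
    if tag == "mu":
        return ("lam", k, ("app", ("mu", t[1], cps(t[2], names)), kv))
    if tag == "cond":
        m = new("m")
        b, m0, m1 = (cps(u, names) for u in t[1:])
        branch = ("cond", ("var", m), ("app", m0, kv), ("app", m1, kv))
        return ("lam", k, ("app", b, ("lam", m, branch)))
    if tag == "app":
        m, n = new("m"), new("n")
        call = ("app", ("app", ("var", m), ("var", n)), kv)
        operand_part = ("app", cps(t[2], names), ("lam", n, call))
        return ("lam", k, ("app", cps(t[1], names), ("lam", m, operand_part)))
    raise ValueError(tag)


print(cps(("lam", "x", ("var", "x"))))
```

Every result begins with a `lam` binding a continuation variable, mirroring the observation that every continuized term begins with a $\lambda$-abstraction.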

The situation for higher-typed terms is more complicated. The continuation of a higher-order
term needs to accept functions which, given a value and another continuation, produce
final answers. The transform of a term of type $\sigma$ is thus a term of type $(\sigma' \to o) \to o$, where
$\sigma'$ is defined recursively by (cf. [15])

$$o' = o$$

$$(\sigma \to \tau)' = \sigma' \to (\tau' \to o) \to o$$
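The type translation is easy to compute mechanically. The Python sketch below assumes the clause $(\sigma \to \tau)' = \sigma' \to (\tau' \to o) \to o$, which agrees with the type of the transformed `succ` in Figure 2.3; the tuple encoding of types and the names `arrow`, `prime`, and `transformed_type` are hypothetical.

```python
O = "o"  # the base type of natural numbers


def arrow(a, b):
    """The function type a -> b."""
    return ("->", a, b)


def prime(t):
    """The type of a transformed value: o' = o, (s -> r)' = s' -> (r' -> o) -> o."""
    if t == O:
        return O
    _, s, r = t
    return arrow(prime(s), arrow(arrow(prime(r), O), O))


def transformed_type(t):
    """The type of the transform of a term of type t: (t' -> o) -> o."""
    return arrow(arrow(prime(t), O), O)


print(transformed_type(O))
print(prime(arrow(O, O)))
```

For the base type this yields $(o \to o) \to o$, the type stated above for transformed terms of type $o$.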

2.3.2  Fundamental  Properties  of  the  Transform 

By  inspecting  the  definition  of  the  transform,  one  may  observe  that  every  operand  in  a 
transformed  term  is  a  value  and  hence  need  not  be  evaluated.  In  other  words,  transformed 
terms  may  be  evaluated  tail-recursively.  Tail-recursiveness  can  lead  to  increased  efficiency. 
A  traditional  call-by-value  interpreter  (or  code  generated  by  compilers)  uses  a  stack  to 
remember  the  position  of  the  subterm  currently  being  evaluated.  In  transformed  terms,  all 
operands  in  applications  are  in  evaluated  form,  so  an  interpreter  designed  specifically  for 
transformed  terms  does  not  require  a  stack.4 
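One way to see why no stack is needed is to run continuation-passing code under a trampoline, a technique swapped in here purely for illustration since Python does not eliminate tail calls. In the sketch below (the names `trampoline` and `sum_cps` are assumptions), every call is a tail call wrapped in a zero-argument function, and a simple loop executes them one at a time, so the Python call stack stays shallow even for inputs far beyond the default recursion limit.

```python
def trampoline(bounce):
    """Run thunks one after another until a non-callable result appears."""
    while callable(bounce):
        bounce = bounce()
    return bounce


def sum_cps(n, k):
    """Sum 0 + 1 + ... + n in continuation-passing style, returning thunks."""
    if n == 0:
        return lambda: k(0)
    # The pending addition is stored in the continuation, not on the call stack.
    return lambda: sum_cps(n - 1, lambda r: (lambda: k(n + r)))


print(trampoline(sum_cps(20_000, lambda r: r)))
```

The pending work has not disappeared; it lives in the chain of continuation closures on the heap, which is the sense in which the cps version carries an explicit representation of the control stack.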

A corollary to the fact that all operands are values is unambiguous reducibility: call-by-name
and call-by-value reduction strategies coincide on transformed terms. Unambiguous
reducibility allows one to use the transform to simulate call-by-value in a call-by-name
interpreter, as is done in [19].

Of  course,  the  transform  must  satisfy  correctness  properties  as  well.  If  one  expects 
to  use  the  transform  as  a  first  step  in  compilation,  for  example,  transformed  terms  must 
not  produce  different  answers  than  the  original  terms!  The  continuation  transform  for  the 
language satisfies two properties that guarantee its correctness: provable equality (i.e., $=_v$)
is  preserved  by  the  transform  and  complete  programs  produce  the  same  output  as  their 
transformed  versions  [9,  19]. 

2.3.2.1 Preservation of equational reasoning

We follow Plotkin’s proof in [19] to show that $M =_v N$ implies $\overline{M} =_v \overline{N}$.

4 One may regard the cps version of a term as incorporating an explicit representation of the interpreter’s
control stack.

Substitutions performed by $=_v$ pose problems to a direct proof. Suppose, for example,
that $=_v$ performs the substitution $M[x := V]$. We want $\overline{(\lambda x.M)\ V} =_v \overline{M[x := V]}$. In point
of fact, it is easy to show that $\overline{(\lambda x.M)\ V} =_v \overline{M}[x := \Psi(V)]$, where

Definition 2.4 If $V$ is a value, then $\Psi(V)$ is defined by

• $\Psi(x^\sigma) = x^{\sigma'}$;

• $\Psi(c_l) = c_l$;

• $\Psi(\mathrm{succ}) = \lambda x.\lambda\kappa_1.\kappa_1\ (\mathrm{succ}\ x)$;

• $\Psi(\mathrm{pred}) = \lambda x.\lambda\kappa_1.\kappa_1\ (\mathrm{pred}\ x)$;

• $\Psi(\lambda x^\sigma.M) = \lambda x^{\sigma'}.\overline{M}$.

(Essentially, $\Psi(V)$ is $\overline{V}$ without the leading continuation.) The following lemma allows us
to complete the argument that $\overline{(\lambda x.M)\ V} =_v \overline{M[x := V]}$:

Lemma 2.5 If $V$ is a value and $x$ is a $\lambda$-variable, then $\overline{M}[x^{\sigma'} := \Psi(V)] \equiv \overline{M[x^\sigma := V]}$.

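Proof: By structural induction on $M$. For the base case, $M$ must be a constant or variable:

Case 1: $M \equiv x$. Then $\overline{M}[x := \Psi(V)] \equiv \lambda\kappa.\kappa\ \Psi(V) \equiv \overline{V} \equiv \overline{M[x := V]}$.

Case 2: $M \equiv t$ for some variable $t \not\equiv x$. Then $\overline{M}[x := \Psi(V)] \equiv \overline{t} \equiv \overline{M[x := V]}$.

Case 3: $M \equiv a$ for some constant $a$. Similar to Case 2.

For the induction case, we also divide into cases depending on the form of $M$:

Case 1: $M \equiv \mathrm{cond}\ B\ M_0\ M_1$. Then

$$
\begin{array}{rcl}
\overline{M}[x := \Psi(V)] &\equiv& (\lambda\kappa.\overline{B}\ (\lambda m.\mathrm{cond}\ m\ (\overline{M_0}\ \kappa)\ (\overline{M_1}\ \kappa)))[x := \Psi(V)] \\
&\equiv& \lambda\kappa.\overline{B[x := V]}\ (\lambda m.\mathrm{cond}\ m\ (\overline{M_0[x := V]}\ \kappa)\ (\overline{M_1[x := V]}\ \kappa)) \\
&& \text{(by the induction hypothesis)} \\
&\equiv& \overline{M[x := V]}.
\end{array}
$$

Case 2: $M \equiv (M_1\ M_2)$. Then

$$
\begin{array}{rcl}
\overline{M}[x := \Psi(V)] &\equiv& (\lambda\kappa.\overline{M_1}\ (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ \kappa)))[x := \Psi(V)] \\
&\equiv& \lambda\kappa.\overline{M_1[x := V]}\ (\lambda m.\overline{M_2[x := V]}\ (\lambda n.m\ n\ \kappa)) \\
&& \text{(by the induction hypothesis)} \\
&\equiv& \overline{M[x := V]}.
\end{array}
$$

Case 3: $M \equiv \lambda y.M'$. If $y \equiv x$, then $\overline{M}[x := \Psi(V)] \equiv \overline{M} \equiv \overline{M[x := V]}$. If $y \not\equiv x$,

$$
\begin{array}{rcl}
\overline{M}[x := \Psi(V)] &\equiv& (\lambda\kappa.\kappa\ (\lambda y.\overline{M'}))[x := \Psi(V)] \\
&\equiv& \lambda\kappa.\kappa\ (\lambda y.\overline{M'[x := V]}) \\
&& \text{(by the induction hypothesis)} \\
&\equiv& \overline{M[x := V]}.
\end{array}
$$

Case 4: $M \equiv \mu f.M'$. We know that $x \not\equiv f$; so

$$
\begin{array}{rcl}
\overline{M}[x := \Psi(V)] &\equiv& (\lambda\kappa.(\mu f.\overline{M'})\ \kappa)[x := \Psi(V)] \\
&\equiv& \lambda\kappa.(\mu f.\overline{M'[x := V]})\ \kappa \\
&& \text{(by the induction hypothesis)} \\
&\equiv& \overline{M[x := V]}.
\end{array}
$$

We have exhausted all cases, hence the lemma holds. ■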

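The analog of Lemma 2.5 for recursive definitions works somewhat more easily:

Lemma 2.6 If $f$ is a $\mu$-variable, then $\overline{M}[f^{(\sigma' \to o) \to o} := \mu f^{(\sigma' \to o) \to o}.\overline{N}] \equiv \overline{M[f^\sigma := \mu f^\sigma.N]}$.

Proof: By structural induction on $M$. In the base case, we divide into cases on the form
of $M$:

Case 1: $M \equiv f$. Then $\overline{M}[f := \mu f.\overline{N}] \equiv \lambda\kappa.(\mu f.\overline{N})\ \kappa \equiv \overline{\mu f.N} \equiv \overline{M[f := \mu f.N]}$.

Case 2: $M \equiv t$ for some variable $t \not\equiv f$. Then $\overline{M}[f := \mu f.\overline{N}] \equiv \overline{t} \equiv \overline{M[f := \mu f.N]}$.

Case 3: $M \equiv a$ for some constant $a$. Similar to Case 2.

For the induction case, there are four cases to consider:

Case 1: $M \equiv \mathrm{cond}\ B\ M_0\ M_1$. Then

$$
\begin{array}{rcl}
\overline{M}[f := \mu f.\overline{N}] &\equiv& (\lambda\kappa.\overline{B}\ (\lambda m.\mathrm{cond}\ m\ (\overline{M_0}\ \kappa)\ (\overline{M_1}\ \kappa)))[f := \mu f.\overline{N}] \\
&\equiv& \lambda\kappa.\overline{B[f := \mu f.N]}\ (\lambda m.\mathrm{cond}\ m\ (\overline{M_0[f := \mu f.N]}\ \kappa)\ (\overline{M_1[f := \mu f.N]}\ \kappa)) \\
&& \text{(by the induction hypothesis)} \\
&\equiv& \overline{M[f := \mu f.N]}.
\end{array}
$$

Case 2: $M \equiv (M_1\ M_2)$. Thus,

$$
\begin{array}{rcl}
\overline{M}[f := \mu f.\overline{N}] &\equiv& (\lambda\kappa.\overline{M_1}\ (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ \kappa)))[f := \mu f.\overline{N}] \\
&\equiv& \lambda\kappa.\overline{M_1[f := \mu f.N]}\ (\lambda m.\overline{M_2[f := \mu f.N]}\ (\lambda n.m\ n\ \kappa)) \\
&& \text{(by the induction hypothesis)} \\
&\equiv& \overline{M[f := \mu f.N]}.
\end{array}
$$

Case 3: $M \equiv \lambda y.M'$. Note that $f \not\equiv y$; thus

$$
\begin{array}{rcl}
\overline{M}[f := \mu f.\overline{N}] &\equiv& (\lambda\kappa.\kappa\ (\lambda y.\overline{M'}))[f := \mu f.\overline{N}] \\
&\equiv& \lambda\kappa.\kappa\ (\lambda y.\overline{M'[f := \mu f.N]}) \\
&& \text{(by the induction hypothesis)} \\
&\equiv& \overline{M[f := \mu f.N]}.
\end{array}
$$

Case 4: $M \equiv \mu g.M'$. If $g \equiv f$, $\overline{M}[f := \mu f.\overline{N}] \equiv \overline{M} \equiv \overline{M[f := \mu f.N]}$. On the other
hand, if $g \not\equiv f$,

$$
\begin{array}{rcl}
\overline{M}[f := \mu f.\overline{N}] &\equiv& (\lambda\kappa.(\mu g.\overline{M'})\ \kappa)[f := \mu f.\overline{N}] \\
&\equiv& \lambda\kappa.(\mu g.\overline{M'[f := \mu f.N]})\ \kappa \\
&& \text{(by the induction hypothesis)} \\
&\equiv& \overline{M[f := \mu f.N]}.
\end{array}
$$

This concludes the proof.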


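Given these two lemmas, we may complete the proof of the theorem:

Theorem 2.7 If $M =_v N$, then $\overline{M} =_v \overline{N}$.

Proof: By induction on the length $n$ of the proof that $M =_v N$. In the base case, the
length of the proof is 1, so an axiom was used:

Case 1: $(\lambda x.M)\ V =_v M[x := V]$, where $V$ is a value. Recall that $\overline{V} \equiv \lambda\kappa.\kappa\ \Psi(V)$.
Therefore,

$$
\begin{array}{rcl}
\overline{(\lambda x.M)\ V} &=_v& \lambda\kappa.\overline{(\lambda x.M)}\ (\lambda m.\overline{V}\ (\lambda n.m\ n\ \kappa)) \\
&=_v& \lambda\kappa.(\lambda m.\overline{V}\ (\lambda n.m\ n\ \kappa))\ (\lambda x.\overline{M}) \\
&=_v& \lambda\kappa.\overline{V}\ (\lambda n.(\lambda x.\overline{M})\ n\ \kappa) \\
&=_v& \lambda\kappa.(\lambda n.(\lambda x.\overline{M})\ n\ \kappa)\ \Psi(V) \\
&=_v& \lambda\kappa.(\lambda x.\overline{M})\ \Psi(V)\ \kappa \\
&=_v& \lambda\kappa.(\overline{M}[x := \Psi(V)])\ \kappa \\
&=_v& \lambda\kappa.\overline{M[x := V]}\ \kappa
\end{array}
$$

where the last equation follows from Lemma 2.5. Examining the continuation transform,
we note that every continuized term begins with a $\lambda$-abstraction; thus,

$$\lambda\kappa.\overline{M[x := V]}\ \kappa =_v \overline{M[x := V]}$$

so $\overline{(\lambda x.M)\ V} =_v \overline{M[x := V]}$.

Case 2: $\mathrm{cond}\ c_0\ M_0\ M_1 =_v M_0$. By calculation,

$$
\begin{array}{rcl}
\overline{\mathrm{cond}\ c_0\ M_0\ M_1} &=_v& \lambda\kappa.(\lambda\kappa_1.\kappa_1\ c_0)\ (\lambda m.\mathrm{cond}\ m\ (\overline{M_0}\ \kappa)\ (\overline{M_1}\ \kappa)) \\
&=_v& \lambda\kappa.\mathrm{cond}\ c_0\ (\overline{M_0}\ \kappa)\ (\overline{M_1}\ \kappa) \\
&=_v& \lambda\kappa.(\overline{M_0}\ \kappa) =_v \overline{M_0}.
\end{array}
$$

Case 3: $\mathrm{cond}\ c_{l+1}\ M_0\ M_1 =_v M_1$. Similar to the previous case.

Case 4: $\mathrm{succ}\ c_l =_v c_{l+1}$. By calculation,

$$
\begin{array}{rcl}
\overline{\mathrm{succ}\ c_l} &=_v& \lambda\kappa.(\lambda\kappa_1.\kappa_1\ (\lambda x.\lambda\kappa_2.\kappa_2\ (\mathrm{succ}\ x)))\ (\lambda m.(\lambda\kappa_3.\kappa_3\ c_l)\ (\lambda n.m\ n\ \kappa)) \\
&=_v& \lambda\kappa.(\lambda\kappa_3.\kappa_3\ c_l)\ (\lambda n.(\lambda x.\lambda\kappa_2.\kappa_2\ (\mathrm{succ}\ x))\ n\ \kappa) \\
&=_v& \lambda\kappa.(\lambda x.\lambda\kappa_2.\kappa_2\ (\mathrm{succ}\ x))\ c_l\ \kappa \\
&=_v& \lambda\kappa.\kappa\ (\mathrm{succ}\ c_l) =_v \lambda\kappa.\kappa\ c_{l+1} =_v \overline{c_{l+1}}.
\end{array}
$$

Case 5: $\mathrm{pred}\ c_0 =_v c_0$. Similar to the previous case.

Case 6: $\mathrm{pred}\ c_{l+1} =_v c_l$. Similar to the previous case.

Case 7: $\mu f.M =_v M[f := \mu f.M]$. By calculation,

$$
\begin{array}{rcl}
\overline{\mu f.M} &=_v& \lambda\kappa.(\mu f.\overline{M})\ \kappa \\
&=_v& \lambda\kappa.(\overline{M}[f := \mu f.\overline{M}])\ \kappa \\
&=_v& \lambda\kappa.(\overline{M[f := \mu f.M]})\ \kappa =_v \overline{M[f := \mu f.M]},
\end{array}
$$

the third equation following from Lemma 2.6.

Case 8: $M =_v M$. Trivial.

In the induction case, the length of the proof is $n + 1$; again, we divide into cases, this
time depending on the last rule used:

Case 1: $M =_v N$ and $N =_v P$ implies $M =_v P$. By the induction hypothesis, we know
that $\overline{M} =_v \overline{N}$ and $\overline{N} =_v \overline{P}$, so we can conclude that $\overline{M} =_v \overline{P}$ by the transitivity rule.

Case 2: $M =_v N$ implies $N =_v M$. Trivial.

Case 3: $M =_v N$ implies $C[M] =_v C[N]$. Using the induction hypothesis $\overline{M} =_v \overline{N}$, an
easy structural induction on the context $C[\cdot]$ shows that $\overline{C[M]} =_v \overline{C[N]}$.

This list exhausts the possibilities for last rule used, hence we are done. ■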


2.3.2.2 Adequacy

Theorem 2.7 does not explain the correspondence of evaluation of terms and their cps-versions.
For complete programs in particular, we expect the interpreter to give the same
answers from both the direct and continuized versions, except that continuized versions
must be passed a “default continuation,” viz., the identity function:

$$M \twoheadrightarrow_v c_l \quad \text{iff} \quad \overline{M}\ (\lambda x.x) \twoheadrightarrow_v c_l.$$

Indeed, this fact must hold if we wish to use cps conversion in compilers.5
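The expected correspondence can be checked on a small example by writing the clauses of the transform directly as Python functions, a shallow embedding that relies on Python's own call-by-value evaluation. The helper names (`var_bar`, `num_bar`, `succ_bar`, `lam_bar`, `app_bar`, `cond_bar`) are hypothetical, and numerals are modelled by Python integers. The example term is $(\lambda x.\mathrm{cond}\ x\ c_7\ (\mathrm{succ}\ x))\ c_2$.

```python
def var_bar(x):
    return lambda k: k(x)          # a value is handed to the continuation


def num_bar(l):
    return lambda k: k(l)


succ_bar = lambda k: k(lambda x: lambda k1: k1(x + 1))


def lam_bar(body):
    # body maps an argument to the transformed body of the abstraction
    return lambda k: k(body)


def app_bar(m, n):
    # evaluate the operator, then the operand, then call with continuation k
    return lambda k: m(lambda f: n(lambda a: f(a)(k)))


def cond_bar(b, m0, m1):
    return lambda k: b(lambda t: m0(k) if t == 0 else m1(k))


# Direct version of (lambda x. cond x 7 (succ x)) 2
direct = (lambda x: 7 if x == 0 else x + 1)(2)

# Continuized version of the same term
transformed = app_bar(
    lam_bar(lambda x: cond_bar(var_bar(x), num_bar(7),
                               app_bar(succ_bar, var_bar(x)))),
    num_bar(2),
)

print(direct, transformed(lambda x: x))
```

Supplying the identity function as the default continuation recovers the direct answer; supplying any other continuation applies it to that answer instead.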

The proof proceeds using the method in [19]. The key observation is that certain reductions
on transformed terms have no corresponding reduction on non-continuized versions.
For example, consider the complete program $c_5$. The direct version cannot be reduced, but
$\overline{c_5}\ (\lambda x.x)$ can be:

$$(\lambda\kappa.\kappa\ c_5)\ (\lambda x.x) \to_v (\lambda x.x)\ c_5 \to_v c_5.$$

The first reduction is called an administrative reduction, since only a continuation is
passed. The relation $*$ applies a continuized term to a continuation and performs all possible
administrative reductions:

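Definition 2.8 For any term $M : \sigma$ and any value $K : \sigma' \to o$, we define $M * K$ by

$$
\begin{array}{rcl}
M * K &=& K\ \Psi(M), \quad \text{if } M \text{ is a value} \\
f^\sigma * K &=& f^{(\sigma' \to o) \to o}\ K, \quad \text{if } f \text{ is a } \mu\text{-variable} \\
(\mathrm{cond}\ B\ M_1\ M_2) * K &=& \begin{cases} \mathrm{cond}\ \Psi(B)\ (\overline{M_1}\ K)\ (\overline{M_2}\ K) & \text{if } B \text{ is a value} \\ B * (\lambda m.(\mathrm{cond}\ m\ (\overline{M_1}\ K)\ (\overline{M_2}\ K))) & \text{otherwise} \end{cases} \\
(M_1\ M_2) * K &=& \begin{cases} M_1 * (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ K)) & \text{if } M_1 \text{ is not a value} \\ M_2 * (\lambda n.\Psi(M_1)\ n\ K) & \text{if } M_1 \text{, but not } M_2 \text{, is a value} \\ \Psi(M_1)\ \Psi(M_2)\ K & \text{otherwise} \end{cases} \\
\mu f.M' * K &=& (\mu f.\overline{M'})\ K
\end{array}
$$

5 Note that the $\Rightarrow$ direction follows from Theorems 2.2 and 2.7, but the converse does not follow directly.

The following lemma confirms that the definition actually represents a “partial reduction”
of a continuized term:

Lemma 2.9 If $K$ is a value, then $\overline{M}\ K \twoheadrightarrow_v M * K$.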

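Proof: By structural induction on $M$. For the base case, divide into cases depending on
$M$:

Case 1: $M \equiv x$. Then $\overline{M}\ K \equiv (\lambda\kappa.\kappa\ x)\ K \to_v K\ x \equiv M * K$.

Case 2: $M \equiv f$. Then $\overline{M}\ K \equiv (\lambda\kappa.f\ \kappa)\ K \to_v f\ K \equiv M * K$.

Case 3: $M \equiv c_l$. Then $\overline{M}\ K \equiv (\lambda\kappa.\kappa\ c_l)\ K \to_v K\ c_l \equiv M * K$.

Case 4: $M \equiv \mathrm{succ}$. Then $\overline{M}\ K \to_v K\ (\lambda x.\lambda\kappa.\kappa\ (\mathrm{succ}\ x)) \equiv M * K$.

Case 5: $M \equiv \mathrm{pred}$. Similar to the previous case.

For the induction case,

Case 1: $M \equiv \mathrm{cond}\ B\ M_1\ M_2$. Then

$$
\begin{array}{rcl}
\overline{M}\ K &\equiv& (\lambda\kappa.\overline{B}\ (\lambda m.\mathrm{cond}\ m\ (\overline{M_1}\ \kappa)\ (\overline{M_2}\ \kappa)))\ K \\
&\to_v& \overline{B}\ (\lambda m.\mathrm{cond}\ m\ (\overline{M_1}\ K)\ (\overline{M_2}\ K)).
\end{array}
$$

If $B$ is a value, then $\overline{M}\ K \twoheadrightarrow_v \mathrm{cond}\ \Psi(B)\ (\overline{M_1}\ K)\ (\overline{M_2}\ K)$; otherwise,

$$\overline{M}\ K \twoheadrightarrow_v B * (\lambda m.\mathrm{cond}\ m\ (\overline{M_1}\ K)\ (\overline{M_2}\ K)) \equiv M * K$$

(by the induction hypothesis).

Case 2: $M \equiv (M_1\ M_2)$. If $M_1$ is not a value,

$$
\begin{array}{rcl}
\overline{M}\ K &\equiv& (\lambda\kappa.\overline{M_1}\ (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ \kappa)))\ K \\
&\to_v& \overline{M_1}\ (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ K)) \\
&\twoheadrightarrow_v& M_1 * (\lambda m.\overline{M_2}\ (\lambda n.m\ n\ K)) \equiv M * K
\end{array}
$$

(by the induction hypothesis).

If $M_1$ but not $M_2$ is a value,

$$
\begin{array}{rcl}
\overline{M}\ K &\twoheadrightarrow_v& \overline{M_2}\ (\lambda n.\Psi(M_1)\ n\ K) \\
&\twoheadrightarrow_v& M_2 * (\lambda n.\Psi(M_1)\ n\ K) \equiv M * K
\end{array}
$$

(by the induction hypothesis).

Finally, if both $M_1$ and $M_2$ are values,

$$\overline{M}\ K \twoheadrightarrow_v \overline{M_2}\ (\lambda n.\Psi(M_1)\ n\ K) \twoheadrightarrow_v \Psi(M_1)\ \Psi(M_2)\ K \equiv M * K.$$

Case 3: $M \equiv \lambda x.M'$. Then

$$\overline{M}\ K \equiv (\lambda\kappa.\kappa\ (\lambda x.\overline{M'}))\ K \to_v K\ (\lambda x.\overline{M'}) \equiv M * K.$$

Case 4: $M \equiv \mu f.M'$. Then

$$\overline{M}\ K \equiv (\lambda\kappa.(\mu f.\overline{M'})\ \kappa)\ K \to_v (\mu f.\overline{M'})\ K \equiv M * K.$$

This concludes the proof of the lemma. ■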

Once the administrative reductions on a continuized term have been performed, the
next reductions correspond to reductions on the original version of the term:

Lemma 2.10 If $M \to_v N$ and $K$ is a value, then $M * K \twoheadrightarrow_v N * K$.

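Proof: By induction on the length of the proof of $M \to_v N$. In the base case, the length of
the proof is 1; we divide into cases depending on the operational rule used:

Case 1: $(\lambda x.M')\,V \to_v M'[x := V]$. Then

\[
\begin{aligned}
M * K &= \Psi(\lambda x.M')\,\Psi(V)\,K \\
 &\to_v (\overline{M'}[x := \Psi(V)])\,K \\
 &\twoheadrightarrow_v M'[x := V] * K && \text{(by Lemmas 2.5 and 2.9)} \\
 &= N * K.
\end{aligned}
\]

Case 2: $\mathrm{succ}\ c_l \to_v c_{l+1}$. Then

\[
M * K \twoheadrightarrow_v \Psi(\mathrm{succ})\,\Psi(c_l)\,K \twoheadrightarrow_v K\,(\mathrm{succ}\ c_l) \to_v K\,c_{l+1} = N * K.
\]

Case 3: $\mathrm{pred}\ c_0 \to_v c_0$. Similar to the previous case.

Case 4: $\mathrm{pred}\ c_{l+1} \to_v c_l$. Similar to the previous case.

Case 5: $\mathrm{cond}\ c_0\ M_1\ M_2 \to_v M_1$. Then

\[
M * K = \mathrm{cond}\ c_0\ (\overline{M_1}\,K)\ (\overline{M_2}\,K) \twoheadrightarrow_v M_1 * K
\]

by Lemma 2.9.

Case 6: $\mathrm{cond}\ c_{l+1}\ M_1\ M_2 \to_v M_2$. Similar to the previous case.

Case 7: $(\mu f.M') \to_v M'[f := \mu f.M']$. Then

\[
\begin{aligned}
M * K &= (\mu f.\overline{M'})\,K \\
 &\to_v (\overline{M'}[f := \mu f.\overline{M'}])\,K \\
 &\twoheadrightarrow_v M'[f := \mu f.M'] * K
\end{aligned}
\]

by Lemmas 2.6 and 2.9.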

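In the induction case we consider proofs of length greater than 1, and divide into cases
depending on the last operational rule used:

Case 1: $B \to_v B'$ implies $\mathrm{cond}\ B\ M_1\ M_2 \to_v \mathrm{cond}\ B'\ M_1\ M_2$. Note that $B$ cannot be a
value; hence if $B'$ is not a value,

\[
\begin{aligned}
M * K &= B * (\lambda m.\mathrm{cond}\ m\ (\overline{M_1}\,K)\ (\overline{M_2}\,K)) \\
 &\twoheadrightarrow_v B' * (\lambda m.\mathrm{cond}\ m\ (\overline{M_1}\,K)\ (\overline{M_2}\,K)) = N * K
\end{aligned}
\]

by the induction hypothesis. If $B'$ is a value, then

\[
M * K \twoheadrightarrow_v \mathrm{cond}\ \Psi(B')\ (\overline{M_1}\,K)\ (\overline{M_2}\,K) = N * K.
\]

Case 2: $P \to_v P'$ implies $\mathrm{succ}\ P \to_v \mathrm{succ}\ P'$. $P$ cannot be a value, so if $P'$ is not a
value,

\[
\begin{aligned}
M * K &= P * (\lambda n.(\lambda x.\lambda\kappa.\kappa\,(\mathrm{succ}\ x))\,n\,K) \\
 &\twoheadrightarrow_v P' * (\lambda n.(\lambda x.\lambda\kappa.\kappa\,(\mathrm{succ}\ x))\,n\,K) = N * K
\end{aligned}
\]

by the induction hypothesis. If $P'$ is a value, then

\[
M * K \twoheadrightarrow_v (\lambda x.\lambda\kappa.\kappa\,(\mathrm{succ}\ x))\,\Psi(P')\,K = N * K.
\]

Case 3: $P \to_v P'$ implies $\mathrm{pred}\ P \to_v \mathrm{pred}\ P'$. Similar to the previous case.

Case 4: $Q \to_v Q'$ implies $(\lambda x.P)\,Q \to_v (\lambda x.P)\,Q'$. Similar to the previous case.

Case 5: $P \to_v P'$ implies $P\,Q \to_v P'\,Q$. $P$ cannot be a value, so if $P'$ is not a value,

\[
\begin{aligned}
M * K &= P * (\lambda m.\overline{Q}\,(\lambda n.m\,n\,K)) \\
 &\twoheadrightarrow_v P' * (\lambda m.\overline{Q}\,(\lambda n.m\,n\,K)) = N * K
\end{aligned}
\]

by the induction hypothesis. If $P'$ is a value and $Q$ is not, then

\[
\begin{aligned}
M * K &\twoheadrightarrow_v \overline{Q}\,(\lambda n.\Psi(P')\,n\,K) \\
 &\twoheadrightarrow_v Q * (\lambda n.\Psi(P')\,n\,K) = N * K
\end{aligned}
\]

by Lemma 2.9. If both $P'$ and $Q$ are values, then

\[
\begin{aligned}
M * K &\twoheadrightarrow_v \overline{Q}\,(\lambda n.\Psi(P')\,n\,K) \\
 &\twoheadrightarrow_v \Psi(P')\,\Psi(Q)\,K = N * K.
\end{aligned}
\]

As all operational rules have been considered, we are done. ■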

These  facts  about  administrative  and  non-administrative  reductions  on  continuized 
terms  give  us  the  ability  to  prove  the  following  theorem  originally  due  to  Fischer  [9]: 

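Theorem 2.11 (Adequacy) If $M$ is a complete program, then

\[
\mathrm{Eval}_v(M) = c_l \quad\text{iff}\quad \mathrm{Eval}_v(\overline{M}\,(\lambda x^{o}.x)) = c_l.
\]

Proof: ($\Rightarrow$) Suppose $\mathrm{Eval}_v(M) = c_l$; then we know that $M \twoheadrightarrow_v c_l$. By Lemmas 2.9 and
2.10 we then have

\[
\overline{M}\,(\lambda x.x) \twoheadrightarrow_v M * (\lambda x.x) \twoheadrightarrow_v c_l * (\lambda x.x) \twoheadrightarrow_v c_l.
\]

Thus, $\mathrm{Eval}_v(\overline{M}\,(\lambda x.x)) = c_l$.

($\Leftarrow$) Suppose $\mathrm{Eval}_v(M)$ is not defined. Then

\[
M \to_v M_1 \to_v M_2 \to_v \cdots
\]

By Lemmas 2.9 and 2.10, we thus know

\[
\overline{M}\,(\lambda x.x) \twoheadrightarrow_v M * (\lambda x.x) \twoheadrightarrow_v M_1 * (\lambda x.x) \twoheadrightarrow_v M_2 * (\lambda x.x) \twoheadrightarrow_v \cdots
\]

so $\mathrm{Eval}_v(\overline{M}\,(\lambda x.x))$ is not defined either. ■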
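
As an informal illustration of the Adequacy Theorem, the following Python sketch evaluates a small closed term twice: once directly, and once in continuation-passing style with the identity continuation supplied at the top. It is a pair of interpreters rather than the syntactic transform itself; the application clause of `eval_cps` mirrors the shape λκ.M̄1 (λm.M̄2 (λn.m n κ)) used in the proofs above. The tuple encoding of terms, the names `eval_direct` and `eval_cps`, and the treatment of succ and pred as unary forms are assumptions made only for this example.

```python
def eval_direct(t, env):
    """Call-by-value evaluation of a term encoded as nested tuples."""
    tag = t[0]
    if tag == "num":
        return t[1]
    if tag == "var":
        return env[t[1]]
    if tag == "lam":
        return lambda v: eval_direct(t[2], {**env, t[1]: v})
    if tag == "app":
        f = eval_direct(t[1], env)          # operator first
        return f(eval_direct(t[2], env))    # then operand
    if tag == "succ":
        return eval_direct(t[1], env) + 1
    if tag == "pred":
        return max(eval_direct(t[1], env) - 1, 0)   # pred c_0 = c_0
    if tag == "cond":
        b = eval_direct(t[1], env)
        return eval_direct(t[2] if b == 0 else t[3], env)  # c_0 picks branch 1
    raise ValueError(f"unknown term: {t!r}")


def eval_cps(t, env, k):
    """Same language, but every result is passed to the continuation k."""
    tag = t[0]
    if tag == "num":
        return k(t[1])
    if tag == "var":
        return k(env[t[1]])
    if tag == "lam":
        # A function value takes its argument, then a continuation.
        return k(lambda v: lambda k2: eval_cps(t[2], {**env, t[1]: v}, k2))
    if tag == "app":
        return eval_cps(t[1], env,
                        lambda m: eval_cps(t[2], env, lambda n: m(n)(k)))
    if tag == "succ":
        return eval_cps(t[1], env, lambda n: k(n + 1))
    if tag == "pred":
        return eval_cps(t[1], env, lambda n: k(max(n - 1, 0)))
    if tag == "cond":
        return eval_cps(t[1], env,
                        lambda b: eval_cps(t[2] if b == 0 else t[3], env, k))
    raise ValueError(f"unknown term: {t!r}")


# (λx. succ (succ x)) (pred c_3)
program = ("app",
           ("lam", "x", ("succ", ("succ", ("var", "x")))),
           ("pred", ("num", 3)))

direct_result = eval_direct(program, {})
cps_result = eval_cps(program, {}, lambda x: x)   # identity continuation
print(direct_result, cps_result)
```

Both evaluations produce the same numeral for this program, which is the behavior the theorem guarantees for complete programs.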
Chapter 3

Continuations  May  Be  Unreasonable 


The  Adequacy  Theorem  establishes  a  strong  connection  between  the  evaluation  of  terms 
and  their  continuized  versions.  The  theorem  easily  extends  to  reasoning  about  complete 
programs,  viz.,  proving  observational  congruences.  It  follows  that  for  complete  programs 
M  and  N , 

\[
M \equiv^{v}_{obs} N \quad\text{iff}\quad \overline{M}\,(\lambda x.x) \equiv^{v}_{obs} \overline{N}\,(\lambda x.x).
\]

The connection between direct and continuized versions of higher-order terms is less obvious,
but one may still see a partial relationship between reasoning on direct versus reasoning on
continuized terms:

Corollary 3.1 If $\overline{M} \equiv^{v}_{obs} \overline{N}$, then $M \equiv^{v}_{obs} N$.

Proof: Suppose $M$ and $N$ were distinguishable by some context $C[\cdot]$. Then by the Adequacy
Theorem, the context $\overline{C[\cdot]}\,(\lambda x.x)$ would distinguish $\overline{M}$ and $\overline{N}$, a contradiction. ■

In  particular,  if  one  can  distinguish  two  terms  by  a  context,  the  transforms  of  those  terms 
will  also  be  distinguishable. 

The problem with the continuation transform is that the converse of Corollary 3.1 does
not hold: observational congruence on direct terms does not coincide with congruence on
continuized terms. Similar anomalies occur in the other two settings. For example, suppose
we augment $\lambda_v$ with the call/cc-like operators $\mathcal{C}$ and $\mathcal{A}$ defined in [7, 8]. Terms that are
observationally congruent in $\lambda_v$ may become distinguishable using contexts containing these
new operators. In the case of continuation semantics, there are observationally congruent
terms that are equivalent in a direct semantics but not equivalent in a continuation semantics.
Reasoning principles based on observational congruence may thus become unsound in
settings involving continuations.

In the continuation transform setting, the anomaly is manifested at terms of higher
type. In particular, two higher-order closed terms may be observationally congruent but
their transforms may not be.

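Theorem 3.2 There exist two closed, pure (i.e., containing no constants, conditionals, or
recursion) terms, namely

\[
\begin{aligned}
M_1 &= \lambda x^{o \to o \to o}.\lambda y^{o \to o}.\lambda z^{o}.(\lambda w^{o}.x\,z\,w)\,(y\,z) \\
M_2 &= \lambda x^{o \to o \to o}.\lambda y^{o \to o}.\lambda z^{o}.x\,z\,(y\,z),
\end{aligned}
\]

with $M_1 \equiv^{v}_{obs} M_2$ but $\overline{M_1} \not\equiv^{v}_{obs} \overline{M_2}$.

Proof: To show $M_1 \equiv^{v}_{obs} M_2$, we proceed in a purely operational fashion using
Theorem 2.3.$^1$ We first show that $M_1 \leq_v M_2$. Pick any values $V_1$, $V_2$, and $V_3$; then $M_2 \twoheadrightarrow_v V'$,
$M_2\,V_1 \twoheadrightarrow_v V''$ and $M_2\,V_1\,V_2 \twoheadrightarrow_v V'''$, so all vectors $\vec{V}$ of length 0, 1, or 2 make the statement

\[
M_1\,\vec{V} \twoheadrightarrow_v V_0' \quad\text{implies}\quad M_2\,\vec{V} \twoheadrightarrow_v V_1'
\]

hold. Now suppose $M_1\,V_1\,V_2\,V_3 \twoheadrightarrow_v c_l$. Then

\[
M_1\,V_1\,V_2\,V_3 \twoheadrightarrow_v (\lambda w.V_1\,V_3\,w)\,(V_2\,V_3) \twoheadrightarrow_v c_l,
\]

so it must be the case that $V_2\,V_3 \twoheadrightarrow_v V'$ and $V_1\,V_3 \twoheadrightarrow_v V''$ for some values $V'$ and $V''$.
Therefore,

\[
\begin{aligned}
M_2\,V_1\,V_2\,V_3 &\twoheadrightarrow_v V_1\,V_3\,(V_2\,V_3) \\
 &\twoheadrightarrow_v V''\,V' \\
 &\twoheadrightarrow_v c_l.
\end{aligned}
\]

Thus, by Theorem 2.3, $M_1 \leq_v M_2$. Using a similar argument, one can show $M_2 \leq_v M_1$.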

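To show that $\overline{M_1} \not\equiv^{v}_{obs} \overline{M_2}$, we first reduce $\overline{M_1}$ and $\overline{M_2}$ using $=_v$:

\[
\begin{aligned}
\overline{M_1} &=_v \lambda\kappa_0.\kappa_0\,(\lambda x.\lambda\kappa_1.\kappa_1\,(\lambda y.\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.y\,z\,(\lambda n.x\,z\,(\lambda m.m\,n\,\kappa_3))))) \\
\overline{M_2} &=_v \lambda\kappa_0.\kappa_0\,(\lambda x.\lambda\kappa_1.\kappa_1\,(\lambda y.\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.x\,z\,(\lambda m.y\,z\,(\lambda n.m\,n\,\kappa_3)))))
\end{aligned}
\]

(where the types have been omitted for clarity). Intuitively, the difference between $\overline{M_1}$ and
$\overline{M_2}$ comes from a difference in the way $M_1$ and $M_2$ are reduced when applied to arguments:
$M_1$ evaluates $(y\,z)$ first, while $M_2$ evaluates $(x\,z)$ first. The typable context

\[
\begin{aligned}
C[\cdot] &= [\cdot]\,N_0, \text{ where} \\
N_0 &= \lambda p.p\,(\lambda a.\lambda b.c_1)\,N_1, \text{ where} \\
N_1 &= \lambda q.q\,(\lambda a.\lambda b.c_2)\,N_2, \text{ where} \\
N_2 &= \lambda r.r\,c_1\,(\lambda a.a)
\end{aligned}
\]

distinguishes $\overline{M_1}$ and $\overline{M_2}$, since $C[\overline{M_1}]$ terminates with result $c_2$ and $C[\overline{M_2}]$ terminates with
result $c_1$:

\[
\begin{aligned}
C[\overline{M_1}] &=_v (\lambda a.\lambda b.c_2)\,c_1\,(\lambda n.(\lambda a.\lambda b.c_1)\,c_1\,(\lambda m.m\,n\,(\lambda a.a))) \\
 &\twoheadrightarrow_v c_2 \\
C[\overline{M_2}] &=_v (\lambda a.\lambda b.c_1)\,c_1\,(\lambda m.(\lambda a.\lambda b.c_2)\,c_1\,(\lambda n.m\,n\,(\lambda a.a))) \\
 &\twoheadrightarrow_v c_1.
\end{aligned}
\]

Thus $\overline{M_1} \not\equiv^{v}_{obs} \overline{M_2}$.$^2$ ■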
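
The distinguishing context can be checked mechanically. The sketch below transcribes the two continuized terms and the context pieces N0, N1, N2 from the proof into Python closures; since Python is call-by-value and a lambda is a value, application order matches the intended reduction. The names `M1_bar`, `M2_bar`, and the use of the integers 1 and 2 for the numerals c1 and c2 are conventions of this example only.

```python
c1, c2 = 1, 2  # stand-ins for the numerals c_1 and c_2

# Continuized M1: evaluates (y z) first, then (x z).
M1_bar = lambda k0: k0(
    lambda x: lambda k1: k1(
        lambda y: lambda k2: k2(
            lambda z: lambda k3:
                y(z)(lambda n: x(z)(lambda m: m(n)(k3))))))

# Continuized M2: evaluates (x z) first, then (y z).
M2_bar = lambda k0: k0(
    lambda x: lambda k1: k1(
        lambda y: lambda k2: k2(
            lambda z: lambda k3:
                x(z)(lambda m: y(z)(lambda n: m(n)(k3))))))

# The context C[.] = [.] N0 supplies "illegal" arguments that ignore
# their continuation and answer immediately.
N2 = lambda r: r(c1)(lambda a: a)
N1 = lambda q: q(lambda a: lambda b: c2)(N2)   # plays the role of y
N0 = lambda p: p(lambda a: lambda b: c1)(N1)   # plays the role of x

result_1 = M1_bar(N0)
result_2 = M2_bar(N0)
print(result_1, result_2)
```

The argument bound to y answers c2 and the argument bound to x answers c1, each discarding the continuation it receives, so whichever of the two is called first determines the final result.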

Using a marked language (cf. Appendix), one can show that the untyped versions of $M_1$
and $M_2$ are congruent in any untyped context. Nevertheless, a simple typable context using
only numerals distinguishes their transforms.

$^1$ Other techniques exist for verifying congruences: one may rely upon either an adequate or fully-abstract
denotational semantics or upon an equational system sound for $\equiv^{v}_{obs}$ yet strong enough to prove the
congruence [12, 20]. Either method rests upon a nontrivial adequacy or soundness proof. Plotkin [18] claims both
methods can be used to prove $M_1 \equiv^{v}_{obs} M_2$, using either pre-domains [22] or Moggi’s $\lambda_p$ [16], but I have not
worked through the proofs of adequacy of the pre-domain semantics or soundness of $\lambda_p$ for $\equiv^{v}_{obs}$.
$^2$ In fact, a stronger statement is true: $\overline{M_1} \not\leq_v \overline{M_2}$ and $\overline{M_2} \not\leq_v \overline{M_1}$.

The Adequacy Theorem clarifies why $\overline{M_1} \not\equiv^{v}_{obs} \overline{M_2}$: a context with “illegal” continuations
distinguishes the continuized terms. One could sensibly argue that $\overline{M_1}$ and $\overline{M_2}$ should not
be distinguished, since the distinguishing context will never arise under the intended uses
of $\overline{M_1}$ and $\overline{M_2}$. But granting this, the theorem nevertheless points out a legitimate concern:
what methods shall we use to prove that two terms are congruent with respect to all “legal”
contexts, and what exactly are the legal contexts? This question might arise if we wanted
to justify a post-transform code optimization in which transformed code $\overline{M}$ was replaced
by an “optimized” expression $N$ equivalent to $\overline{M}$ in all legal contexts. For any $N_0$, $N$ itself
need not equal $\overline{N_0}$.

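It is not surprising that Theorem 3.2 has an analog in the call/cc setting. Consider,
for example, the language $\lambda_c$ with the call/cc-like operator $\mathcal{C}$ and the abort operator $\mathcal{A}$
[6, 7, 8]. More precisely, $\lambda_c$ has the same syntax as the untyped version of $\lambda_v$ (i.e., where no
variables are decorated with types, and terms need not be well-typed), with the additional
terms $\mathcal{C}\,M$ and $\mathcal{A}\,M$. The reduction relation for $\lambda_c$, $\to_c$, is defined by the rules

\[
\begin{aligned}
(\mathcal{A}\,M)\,N &\to_c \mathcal{A}\,M & (\mathcal{C}\,M)\,N &\to_c \mathcal{C}\,(\lambda\kappa.M\,(\lambda m.\kappa\,(m\,N))) \\
V\,(\mathcal{A}\,M) &\to_c \mathcal{A}\,M & V\,(\mathcal{C}\,M) &\to_c \mathcal{C}\,(\lambda\kappa.M\,(\lambda v.\kappa\,(V\,v)))
\end{aligned}
\]

and the outermost computation rules (which are only applicable in empty contexts)

\[
\mathcal{A}\,M \rhd_c M \qquad \mathcal{C}\,M \rhd_c M\,(\lambda x.\mathcal{A}\,x)
\]

in addition to the (untyped versions of) rules of $\to_v$. Let $\twoheadrightarrow_c$ be the reflexive, transitive
closure of $(\to_c \cup \rhd_c)$, and let $\equiv^{c}_{obs}$ denote the observational congruence relation on terms of
$\lambda_c$ when observing numerals. Then

Theorem 3.3 If $M_1$ and $M_2$ are the terms above, $M_1 \not\equiv^{c}_{obs} M_2$.

Proof: Let $C[\cdot]$ be the context $[\cdot]\,(\lambda x.\Omega)\,(\lambda y.\mathcal{C}\,(\lambda x.c_1))\,c_1$. Here, $\Omega$ is any divergent term
(such as $\mu f.f$). This context forces $C[M_2]$ to diverge but makes $C[M_1]$ converge to $c_1$:

\[
\begin{aligned}
C[M_1] &\twoheadrightarrow_c (\lambda w.(\lambda x.\Omega)\,c_1\,w)\,((\lambda y.\mathcal{C}\,(\lambda x.c_1))\,c_1) \\
 &\twoheadrightarrow_c (\lambda w.(\lambda x.\Omega)\,c_1\,w)\,(\mathcal{C}\,(\lambda x.c_1)) \\
 &\to_c \mathcal{C}\,(\lambda\kappa.(\lambda x.c_1)\,(\lambda v.\kappa\,((\lambda w.(\lambda x.\Omega)\,c_1\,w)\,v))) \\
 &\rhd_c (\lambda\kappa.(\lambda x.c_1)\,(\lambda v.\kappa\,((\lambda w.(\lambda x.\Omega)\,c_1\,w)\,v)))\,(\lambda x.\mathcal{A}\,x) \\
 &\twoheadrightarrow_c (\lambda x.c_1)\,(\lambda v.(\lambda x.\mathcal{A}\,x)\,((\lambda w.(\lambda x.\Omega)\,c_1\,w)\,v)) \\
 &\twoheadrightarrow_c c_1 \\[1ex]
C[M_2] &\twoheadrightarrow_c ((\lambda x.\Omega)\,c_1)\,((\lambda y.\mathcal{C}\,(\lambda x.c_1))\,c_1) \\
 &\twoheadrightarrow_c \Omega\,((\lambda y.\mathcal{C}\,(\lambda x.c_1))\,c_1) \\
 &\twoheadrightarrow_c \Omega\,((\lambda y.\mathcal{C}\,(\lambda x.c_1))\,c_1) \\
 &\twoheadrightarrow_c \cdots
\end{aligned}
\]

Thus, $M_1 \not\equiv^{c}_{obs} M_2$. ■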
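
The role of evaluation order in this proof can be imitated in Python, with two loud caveats. Python has no call/cc, so the sketch models only the two effects the context relies on: an exception `Abort` stands in for the control term reaching the top level and answering c1, and an exception `Diverges` stands in for the divergent term (a genuinely divergent term would simply never return). The class names, the helper `run`, and the string `"diverges"` are all assumptions of this example, not part of the calculus.

```python
class Abort(Exception):
    """Stand-in for a control operator that discards the rest of the program."""
    def __init__(self, value):
        super().__init__(value)
        self.value = value


class Diverges(Exception):
    """Stand-in for a divergent computation."""


def omega(_):
    # Plays the role of the first context argument: calling it "diverges".
    raise Diverges()


def abort_c1(_):
    # Plays the role of the second context argument: calling it aborts with 1.
    raise Abort(1)


# Direct (untransformed) terms from Theorem 3.2.
M1 = lambda x: lambda y: lambda z: (lambda w: x(z)(w))(y(z))   # (y z) first
M2 = lambda x: lambda y: lambda z: x(z)(y(z))                  # (x z) first


def run(term):
    """Plug the term into the context [.] omega abort_c1 1 and observe it."""
    try:
        return term(omega)(abort_c1)(1)
    except Abort as a:
        return a.value
    except Diverges:
        return "diverges"


outcome_1 = run(M1)
outcome_2 = run(M2)
print(outcome_1, outcome_2)
```

In the pure language the two terms cannot be told apart; once an argument can abort, the order in which (x z) and (y z) are evaluated becomes observable.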

The particular terms $M_1$ and $M_2$ can also be used to point out problems with continuation
semantics. If one bases the semantics of $\lambda_v$ on the transform, i.e., the meaning of a
term $M$ is the meaning of $\overline{M}$ in some well-chosen model, two terms may be observationally
congruent but fail to be equivalent in the model. The terms $M_1$ and $M_2$ again provide the
desired example.

Less contrived examples appear in the literature. Meyer and Sieber, for instance, point
out that two Algol blocks may be observationally congruent but not congruent if goto
statements are allowed [14]. Since jumps are usually definable in a continuation semantics,
the two blocks will not be semantically equivalent. Reasoning principles based on a continuation
semantics may thus lead one to conclude facts that are not true about the actual
behavior of code.

The  failure  of  familiar  reasoning  principles  seems  to  be  known  (albeit  informally)  in 
the  community  of  compiler  designers.  In  the  presence  of  control  operators  or  cps-converted 
code,  typical  compiler  optimizations  are  unsound  and  procedure  calls  are  often  treated  as 
“black  holes.”  But  one  need  not  conclude  from  the  failure  of  some  reasoning  principles  that 
the  situation  for  continuations  is  a  black  hole.  There  are  interesting  reasoning  principles 
which hold in continuation settings. For example, consider the $\lambda_v$ terms

\[
\begin{aligned}
P_1 &= \lambda a.\lambda b.(\lambda x.x)\,((\lambda y.y)\,(a\,b)) \\
P_2 &= \lambda a.\lambda b.(\lambda x.x)\,(a\,b)
\end{aligned}
\]

that are not provably equivalent using $=_v$. In $\lambda_c$ these two terms are observationally
congruent, a fact proven by Felleisen [5], who has developed further principles for proving
observational congruences in this setting. A setting involving continuations seems to require
a new theory for reasoning about code. Such a theory remains to be found.

Chapter 4

Conclusion 


Reasoning about the behavior of cps-converted code requires additional assumptions if converted
terms are to behave as their direct versions. Theorem 3.2 makes this formal: continuations
not arising in continuized contexts may distinguish cps-converted terms. Two
possible approaches for a theory of continuations may be based upon this observation.

One approach to a theory of continuations attempts to capture the notion of a “legal”
continuation. An algebraic method along these lines is developed in [15] using retractions.$^1$

Definition  4.1  (Informal)  A  retraction  pair  (i,j)  is  a  pair  of  functions  such  that  for 
any  x,  j  ( i  x)  =  x. 
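
A minimal sketch of one retraction pair, written with Python closures: `embed` wraps a value as a computation that expects a continuation, and `project` recovers the value by supplying the identity continuation. The names `embed` and `project` are hypothetical and chosen for this example; the pair is meant only to illustrate the defining equation j (i x) = x, not to reproduce the typed construction of [15].

```python
def embed(x):
    """i: turn a value into a computation awaiting a continuation."""
    return lambda k: k(x)


def project(t):
    """j: run a computation with the identity continuation."""
    return t(lambda a: a)


samples = [0, 1, 42, "c_1"]
round_trips = [project(embed(x)) for x in samples]
print(round_trips)

# The other composite need not be the identity: a computation that ignores
# its continuation is not of the form embed(x).
ignores_k = lambda k: 0
restored = embed(project(ignores_k))
print(ignores_k(lambda a: a + 1), restored(lambda a: a + 1))
```

Only one direction of the round trip is required to be the identity, which is what separates a retraction pair from an isomorphism.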

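Meyer and Wand define retraction pairs (at all types) that, when applied to a continuized
term, supply the right continuations at the right time. Specifically, in the simply-typed,
call-by-name $\lambda$-calculus with no constants ($\lambda_n$, with $\beta\eta$ equational reasoning $=_n$), Meyer
and Wand prove


Theorem 4.2 (Meyer, Wand) For any type $\sigma$, there exist $\lambda_n$-definable retraction pairs
$(i_\sigma, j_\sigma)$ and $(I_\sigma, J_\sigma)$, where $i_\sigma : \sigma \to \sigma'$, $j_\sigma : \sigma' \to \sigma$, $I_\sigma : \sigma' \to ((\sigma' \to o) \to o)$, and
$J_\sigma : ((\sigma' \to o) \to o) \to \sigma'$, namely

\[
\begin{aligned}
I_\sigma &= \lambda x^{\sigma'}.\lambda\kappa^{\sigma' \to o}.\kappa\,x \\
J_\sigma &=
  \begin{cases}
    \lambda x^{(o \to o) \to o}.x\,(\lambda a^{o}.a) & \text{if } \sigma = o \\
    \lambda x^{(\sigma' \to o) \to o}.\lambda b^{\rho'}.\lambda\kappa^{\tau' \to o}.x\,(\lambda a^{\sigma'}.a\,b\,\kappa) & \text{if } \sigma = \rho \to \tau
  \end{cases} \\
i_\sigma &=
  \begin{cases}
    \lambda x^{o}.x & \text{if } \sigma = o \\
    \lambda x^{\rho \to \tau}.\lambda a^{\rho'}.I_\tau\,(i_\tau\,(x\,(j_\rho\,a))) & \text{if } \sigma = \rho \to \tau
  \end{cases} \\
j_\sigma &=
  \begin{cases}
    \lambda x^{o}.x & \text{if } \sigma = o \\
    \lambda x^{(\rho \to \tau)'}.\lambda a^{\rho}.j_\tau\,(J_\tau\,(x\,(i_\rho\,a))) & \text{if } \sigma = \rho \to \tau
  \end{cases}
\end{aligned}
\]

Moreover, $M =_n j_\sigma(J_\sigma\,\overline{M})$ for any closed, pure term $M$.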


By applying the retractions, one can thus recover the meaning of a direct term from its
continuized form.$^2$

$^1$ Inclusive predicates have also been used to establish connections between the direct and continuation
semantics of a language [24, 26, 29]. The inclusive predicate approach seems necessary in cases where the
denotational domains are built recursively.

$^2$ Even in the simplified setting of $\lambda_n$, we cannot expect to have $\overline{M} = t(M)$ for any $t$. This follows because
there are two pure, closed terms $M$, $N$ where $M =_n N$ but $\overline{M}$ and $\overline{N}$ $\lambda_n$-convert to distinct normal forms,
namely the terms $M = \lambda a.\lambda b.\lambda c.(\lambda x.a)\,(b\,c)$ and $N = \lambda a.\lambda b.\lambda c.a$. If $\vdash t(M) = \overline{M}$ and $\vdash t(N) = \overline{N}$, then it
would follow that $\vdash \overline{M} = \overline{N}$ which, by Statman’s typical ambiguity theorem [27], is equationally inconsistent.
In the case of $\lambda_v$, we similarly cannot have $\overline{M} = t(M)$ for any $t$ by Theorem 3.2.

Theorem 4.2 can be misleading as soon as recursion is added to the language. In the
pure simply-typed calculus, call-by-name and call-by-value convertibility coincide since no
term causes a divergent computation [2]. Because call-by-name equational reasoning is not
sound for the observational congruence theory of $\lambda_v$, the retraction pairs above may not be
appropriate for $\lambda_v$. In fact, the retraction pairs are no longer retractions: one can only show
that $M \leq_v J_\sigma(I_\sigma\,M)$ and $M \leq_v j_\sigma(i_\sigma\,M)$. We conjecture that a similar reformulation of
Theorem 4.2 holds.$^3$

Conjecture 4.3 For any closed term $M$ of type $\sigma$, $M \leq_v j_\sigma(J_\sigma\,\overline{M})$.

This conjecture does not hold if we reverse the $\leq_v$:

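Theorem 4.4 Let $\sigma = (o \to o \to o \to o) \to (o \to o) \to o \to (o \to o)$ and let

\[
S = \lambda x.\lambda y.\lambda z.x\,z\,(y\,z)
\]

be of type $\sigma$. Then $j_\sigma(J_\sigma\,\overline{S}) \not\leq_v S$.

Proof: In the proof of Theorem 3.2, we saw that

\[
\overline{S} =_v \lambda\kappa_0.\kappa_0\,(\lambda x.\lambda\kappa_1.\kappa_1\,(\lambda y.\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.x\,z\,(\lambda m.y\,z\,(\lambda n.m\,n\,\kappa_3)))))
\]

Using the fact that $(i\,V)$ is $=_v$ to a value, we can find a simpler form for $j(J\,\overline{S})$:

\[
\begin{aligned}
j_\sigma(J_\sigma\,\overline{S}) &=_v j_\sigma(\lambda x.\lambda\kappa_1.\kappa_1\,(\lambda y.\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.x\,z\,(\lambda m.y\,z\,(\lambda n.m\,n\,\kappa_3))))) \\
 &=_v \lambda a_1.j(J\,(\lambda\kappa_1.\kappa_1\,(\lambda y.\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.(i\,a_1)\,z\,(\lambda m.y\,z\,(\lambda n.m\,n\,\kappa_3)))))) \\
 &=_v \lambda a_1.\lambda a_2.j(J\,(\lambda\kappa_2.\kappa_2\,(\lambda z.\lambda\kappa_3.(i\,a_1)\,z\,(\lambda m.(i\,a_2)\,z\,(\lambda n.m\,n\,\kappa_3))))) \\
 &=_v \lambda a_1.\lambda a_2.\lambda a_3.j(J\,(\lambda\kappa_3.(i\,a_1)\,(i\,a_3)\,(\lambda m.(i\,a_2)\,(i\,a_3)\,(\lambda n.m\,n\,\kappa_3)))) \\
 &=_v \lambda a_1.\lambda a_2.\lambda a_3.\lambda a_4. \\
 &\qquad j_o(J_o\,(\lambda\kappa.(i\,a_1)\,(i\,a_3)\,(\lambda m.(i\,a_2)\,(i\,a_3)\,(\lambda n.m\,n\,(\lambda a.a\,(i\,a_4)\,\kappa)))))
\end{aligned}
\]

Thus, in the typable context

\[
C[\cdot] = (\lambda x.c_1)\,([\cdot]\,(\lambda a.\Omega)\,V_1\,V_2)
\]

where $V_1$ and $V_2$ are closed values, $C[S]$ does not halt but $C[j(J\,\overline{S})] \twoheadrightarrow_v c_1$. ■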

It also remains open whether there is a $\lambda_v$-definable $j$ such that $M \equiv^{v}_{obs} j(\overline{M})$ or even
whether an interpretation of such a $j$ exists in one of the standard semantical models of $\lambda_v$.

Another approach to a theory of continuations involves finding general methods for
proving observational congruences like that of $P_1$ and $P_2$. A theory in this spirit might exploit the
analogy between the three settings of continuation transform, continuation semantics, and
call/cc-like congruence. We conjecture that a precise match may be found among them.

Conjecture 4.5 For appropriate choice of direct semantics $\mathcal{D}[\![\cdot]\!]$, continuation semantics
$\mathcal{C}[\![\cdot]\!]$, continuation transform $\overline{M}$, and observational congruence relation $\equiv^{c}_{obs}$ using call/cc-like
operators in contexts,

\[
\begin{aligned}
\overline{M} \equiv^{v}_{obs} \overline{N} \quad &\text{iff}\quad \mathcal{D}[\![\overline{M}]\!] = \mathcal{D}[\![\overline{N}]\!] \\
 &\text{iff}\quad \mathcal{C}[\![M]\!] = \mathcal{C}[\![N]\!] \\
 &\text{iff}\quad M \equiv^{c}_{obs} N.
\end{aligned}
\]

$^3$ The announcement in [13] of this result is withdrawn.

Establishing this conjecture clearly requires finding a suitably matched triple of transform,
continuation semantics, and call/cc-like operators; e.g., we obviously must not try to
match up a call-by-value transform with a call-by-name direct semantics of a language with
call/cc-like operators.

Developing  reliable  principles  for  reasoning  about  continuations  is  the  ultimate  goal  of 
this  research,  and  it  is  unclear  (at  this  time)  which  of  these  two  approaches  will  yield  general 
principles. Both avenues are being pursued.

Appendix A

Standard  Theorems  for  the  Language 


The appendix is a compendium of some standard facts about the language $\lambda_v$. Similar
results appear in [2, 19, 20] for call-by-name languages; the techniques for proving these
facts carry over largely to the case of $\lambda_v$. The results are stated and proved with little
comment.


A.1 Church-Rosser Theorem

We follow the proof in [2], using a technique due to Tait and Martin-Löf.

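Definition A.1 The relation $\Rightarrow_p$, the parallel reduction relation, is defined inductively
as follows:

\[
M \Rightarrow_p M
\qquad
\mathrm{succ}\ c_l \Rightarrow_p c_{l+1}
\qquad
\mathrm{pred}\ c_0 \Rightarrow_p c_0
\qquad
\mathrm{pred}\ c_{l+1} \Rightarrow_p c_l
\]

\[
\frac{P \Rightarrow_p P'}{\mathrm{cond}\ c_0\ P\ Q \Rightarrow_p P'}
\qquad
\frac{Q \Rightarrow_p Q'}{\mathrm{cond}\ c_{l+1}\ P\ Q \Rightarrow_p Q'}
\qquad
\frac{B \Rightarrow_p B' \quad P \Rightarrow_p P' \quad Q \Rightarrow_p Q'}{\mathrm{cond}\ B\ P\ Q \Rightarrow_p \mathrm{cond}\ B'\ P'\ Q'}
\]

\[
\frac{M \Rightarrow_p M' \quad N \Rightarrow_p N'}{M\,N \Rightarrow_p M'\,N'}
\qquad
\frac{M \Rightarrow_p M'}{\lambda x.M \Rightarrow_p \lambda x.M'}
\qquad
\frac{M \Rightarrow_p M' \quad N \Rightarrow_p V}{(\lambda x.M)\,N \Rightarrow_p M'[x := V]}\ (V \text{ a value})
\]

\[
\frac{M \Rightarrow_p M'}{\mu f.M \Rightarrow_p \mu f.M'}
\qquad
\frac{M \Rightarrow_p M' \quad \mu f.M \Rightarrow_p N}{\mu f.M \Rightarrow_p M'[f := N]}
\]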

Lemma A.2 If $N \Rightarrow_p N'$ and $v$ is any variable, then $M[v := N] \Rightarrow_p M[v := N']$.

Proof: By structural induction on $M$. There are two cases to consider in the base case:

Case 1: $M = v$; then $M[v := N] = N \Rightarrow_p N' = M[v := N']$.

Case 2: $M = v'$ for $v'$ some constant or variable not equal to $v$. Then

\[
M[v := N] = v' \Rightarrow_p v' = M[v := N'].
\]

There are six cases in the induction case:

Case 1: $M = \lambda v.P$; then $M[v := N] = M \Rightarrow_p M = M[v := N']$.

Case 2: $M = \mu v.P$; then $M[v := N] = M \Rightarrow_p M = M[v := N']$.

Case 3: $M = \lambda y.P$. By induction, $P[v := N] \Rightarrow_p P[v := N']$; thus,

\[
M[v := N] \Rightarrow_p M[v := N'].
\]

Case 4: $M = \mu f.P$. Similar to the previous case.

Case 5: $M = \mathrm{cond}\ P_1\ P_2\ P_3$; by induction, $P_i[v := N] \Rightarrow_p P_i[v := N']$. Thus,

\[
M[v := N] \Rightarrow_p M[v := N'].
\]

Case 6: $M = (P_1\,P_2)$; by induction, $P_i[v := N] \Rightarrow_p P_i[v := N']$, so

\[
M[v := N] \Rightarrow_p M[v := N'].
\]

This completes the proof. ■


Lemma A.3 Suppose $M \Rightarrow_p M'$ and $N \Rightarrow_p N'$. If $v$ is a $\lambda$-variable and $N$ is a value, then
$M[v := N] \Rightarrow_p M'[v := N']$. If $v$ is a $\mu$-variable, then $M[v := N] \Rightarrow_p M'[v := N']$.

Proof: By induction on the definition of $M \Rightarrow_p M'$. In the base case, there are four cases:

Case 1: $M' = M$. By Lemma A.2, $M[v := N] \Rightarrow_p M'[v := N']$.

Case 2: $M = \mathrm{succ}\ c_l$ and $M' = c_{l+1}$. Then $M[v := N] = M \Rightarrow_p M' = M'[v := N']$.

Case 3: $M = \mathrm{pred}\ c_0$ and $M' = c_0$. Similar to the previous case.

Case 4: $M = \mathrm{pred}\ c_{l+1}$ and $M' = c_l$. Similar to the previous case.

This completes the base case. In the induction case, there are ten cases:

Case 1: $M = \mathrm{cond}\ c_0\ P_2\ P_3$ and $M' = P_2'$. By induction, $P_2[v := N] \Rightarrow_p P_2'[v := N']$.
Thus, $M[v := N] \Rightarrow_p M'[v := N']$.

Case 2: $M = \mathrm{cond}\ c_{l+1}\ P_2\ P_3$ and $M' = P_3'$. Similar to the previous case.

Case 3: $M = \mathrm{cond}\ P_1\ P_2\ P_3$ and $M' = \mathrm{cond}\ P_1'\ P_2'\ P_3'$. By induction, we know that
$P_i[v := N] \Rightarrow_p P_i'[v := N']$. Thus, $M[v := N] \Rightarrow_p M'[v := N']$.

Case 4: $M = \lambda x.P$ and $M' = \lambda x.P'$. If $v = x$, then

\[
M[v := N] = M \Rightarrow_p M' = M'[v := N'].
\]

If $v \neq x$, then by induction $P[v := N] \Rightarrow_p P'[v := N']$, so $M[v := N] \Rightarrow_p M'[v := N']$.

Case 5: $M = P\,Q$ and $M' = P'\,Q'$. Similar to Case 3.

Case 6: $M = (\lambda v.P)\,Q$ and $M' = P'[v := Q']$, where $P \Rightarrow_p P'$, $Q \Rightarrow_p Q'$, and $Q'$ is
a value. By the induction hypothesis, $Q[v := N] \Rightarrow_p Q'[v := N']$. Also, since $v$ is a
$\lambda$-variable, $N$ must be a value, so $Q'[v := N']$ must be a value. We can thus use the rules
of $\Rightarrow_p$:

\[
M[v := N] \Rightarrow_p P'[v := Q'[v := N']] = M'[v := N'].
\]

Case 7: $M = (\lambda x.P)\,Q$ and $M' = P'[x := Q']$, where $v \neq x$, $P \Rightarrow_p P'$, $Q \Rightarrow_p Q'$, and
$Q'$ is a value. By the induction hypothesis, $P[v := N] \Rightarrow_p P'[v := N']$ and similarly
for $Q$. If $v$ is a $\lambda$-variable, then $Q'[v := N']$ is a value since $N$ is a value by hypothesis;
if $v$ is a $\mu$-variable, $Q'[v := N']$ is a value no matter what $N$ is since $Q'$ is a value.
Thus,

\[
M[v := N] \Rightarrow_p P'[v := N'][x := Q'[v := N']] = P'[x := Q'][v := N'] = M'[v := N'].
\]

Case 8: $M = \mu f.P$ and $M' = \mu f.P'$. Similar to Case 4.

Case 9: $M = \mu v.P$ and $M' = P'[v := Q']$, where $P \Rightarrow_p P'$, $M \Rightarrow_p Q'$. Note that $v$
cannot be free in $Q'$, since it is not free in $M$. Thus,

\[
M[v := N] = M \Rightarrow_p M' = M'[v := N'].
\]

Case 10: $M = \mu f.P$ and $M' = P'[f := Q']$, where $f \neq v$, $P \Rightarrow_p P'$ and $M \Rightarrow_p Q'$. By
induction, $M[v := N] \Rightarrow_p Q'[v := N']$ and $P[v := N] \Rightarrow_p P'[v := N']$. Thus,

\[
M[v := N] \Rightarrow_p P'[v := N'][f := Q'[v := N']] = P'[f := Q'][v := N'] = M'[v := N'].
\]

This completes the proof. ■


Lemma A.4 The relation $\Rightarrow_p$ is Church-Rosser.

Proof: Suppose $M \Rightarrow_p M_1$ and $M \Rightarrow_p M_2$. To show that there is an $M_3$ with $M_1 \Rightarrow_p M_3$
and $M_2 \Rightarrow_p M_3$, proceed by induction on the proof of $M \Rightarrow_p M_1$. In the base case, there
are four cases:

Case 1: $M_1 = M$. Pick $M_3 = M_2$; this satisfies the conditions.

Case 2: $M = \mathrm{succ}\ c_l$ and $M_1 = c_{l+1}$. Pick $M_3 = c_{l+1}$; since $M_2$ can only be $M$ or $M_1$,
this choice of $M_3$ suffices.

Case 3: $M = \mathrm{pred}\ c_0$ and $M_1 = c_0$. Pick $M_3 = c_0$; as with the previous case, this $M_3$
meets the conditions since $M_2$ can only be $M$ or $M_1$.

Case 4: $M = \mathrm{pred}\ c_{l+1}$ and $M_1 = c_l$. Pick $M_3 = c_l$; again, this choice suffices.

This completes the base case. In the induction case, there are eight cases to consider:

Case 1: $M = \mathrm{cond}\ c_0\ P_2\ P_3$ and $M_1 = P_2'$. Then $M_2$ is either $P_2''$ or $\mathrm{cond}\ c_0\ P_2''\ P_3''$. By
the induction hypothesis, there is a $P_2'''$ with $P_2' \Rightarrow_p P_2'''$ and $P_2'' \Rightarrow_p P_2'''$. Then picking
$M_3$ to be $P_2'''$ works.

Case 2: $M = \mathrm{cond}\ c_{l+1}\ P_2\ P_3$ and $M_1 = P_3'$. Similar to the previous case.

Case 3: $M = \mathrm{cond}\ P_1\ P_2\ P_3$ and $M_1 = \mathrm{cond}\ P_1'\ P_2'\ P_3'$, where $P_i \Rightarrow_p P_i'$. Then $M_2$
is either $P_2''$, $P_3''$, or $\mathrm{cond}\ P_1''\ P_2''\ P_3''$. By the induction hypothesis, there are $P_i'''$ with
$P_i' \Rightarrow_p P_i'''$ and $P_i'' \Rightarrow_p P_i'''$. Then picking $M_3$ to be either $P_2'''$, $P_3'''$, or $\mathrm{cond}\ P_1'''\ P_2'''\ P_3'''$
(as appropriate) works.

Case 4: $M = \lambda x.P$ and $M_1 = \lambda x.P'$, where $P \Rightarrow_p P'$. Then $M_2$ must also be of
the form $\lambda x.P''$. By induction, pick $P'''$ where $P' \Rightarrow_p P'''$ and $P'' \Rightarrow_p P'''$. Then
$M_3 = \lambda x.P'''$ will work.

Case 5: $M = (\lambda x.P)\,Q$ and $M_1 = P'[x := Q']$, where $P \Rightarrow_p P'$, $Q \Rightarrow_p Q'$, and $Q'$ is a
value. There are two subcases:

Subcase i: $M_2 = (\lambda x.P'')\,Q''$. By induction, there are $P'''$ and $Q'''$ with $P' \Rightarrow_p P'''$
and $P'' \Rightarrow_p P'''$; pick $Q'''$ similarly. Since $Q'$ is a value, $Q'''$ must also be a value.
Pick $M_3 = P'''[x := Q''']$; $M_2 \Rightarrow_p M_3$ easily, and $M_1 \Rightarrow_p M_3$ by Lemma A.3.

Subcase ii: $M_2 = P''[x := Q'']$. By induction, there are two terms $P'''$ and $Q'''$
with $P' \Rightarrow_p P'''$ and $P'' \Rightarrow_p P'''$; pick $Q'''$ similarly. Since $Q'$ is a value, $Q'''$ must
also be a value. Pick $M_3 = P'''[x := Q''']$; then both $M_2 \Rightarrow_p M_3$ and $M_1 \Rightarrow_p M_3$
by Lemma A.3.

Case 6: $M = P\,Q$ and $M_1 = P'\,Q'$, where $P \Rightarrow_p P'$ and $Q \Rightarrow_p Q'$. There are two
subcases:

Subcase i: $M_2 = P''\,Q''$. By induction, pick $P'''$ with $P' \Rightarrow_p P'''$ and $P'' \Rightarrow_p P'''$;
pick $Q'''$ similarly. Then $M_3 = P'''\,Q'''$ works.

Subcase ii: $P = \lambda x.R$ and $M_2 = R''[x := Q'']$, for $Q''$ a value. Then $P' = \lambda x.R'$.
By induction, pick $R'''$ with $R' \Rightarrow_p R'''$ and $R'' \Rightarrow_p R'''$; pick $Q'''$ similarly. As
above, note that $Q'''$ must be a value. Picking $M_3$ to be $R'''[x := Q''']$ works,
since $M_1 \Rightarrow_p M_3$ easily and $M_2 \Rightarrow_p M_3$ by Lemma A.3.

Case 7: $M = (\mu f.P)$ and $M_1 = \mu f.P'$.

Subcase i: $M_2 = \mu f.P''$. By induction, pick $P'''$ as before; then $M_3 = \mu f.P'''$
works.

Subcase ii: $M_2 = P''[f := Q'']$, where $P \Rightarrow_p P''$ and $M \Rightarrow_p Q''$. By induction,
pick $P'''$ as before, and let $M_3 = P'''[f := Q'']$; $M_1 \Rightarrow_p M_3$ by the rules of $\Rightarrow_p$,
and $M_2 \Rightarrow_p M_3$ by Lemma A.3.

Case 8: $M = (\mu f.P)$ and $M_1 = P'[f := Q']$, where $P \Rightarrow_p P'$ and $M \Rightarrow_p Q'$.

Subcase i: $M_2 = \mu f.P''$. By induction, pick $P'''$ as before and pick $Q'''$ where
$Q' \Rightarrow_p Q'''$ and $M_2 \Rightarrow_p Q'''$. Then $M_3 = P'''[f := Q''']$ works, since $M_1 \Rightarrow_p M_3$
by Lemma A.3 and $M_2 \Rightarrow_p M_3$ by the rules of $\Rightarrow_p$.

Subcase ii: $M_2 = P''[f := Q'']$, where $P \Rightarrow_p P''$ and $M \Rightarrow_p Q''$. By induction,
pick $P'''$ and $Q'''$ as before, and let $M_3 = P'''[f := Q''']$; then $M_1 \Rightarrow_p M_3$ and
$M_2 \Rightarrow_p M_3$ by Lemma A.3.


Definition A.5 $M \Rightarrow_v N$ iff $M =_v N$ using no instance of the symmetry axiom.

Lemma A.6 $M \Rightarrow_p^{*} N$ iff $M \Rightarrow_v N$.

Proof: Let $\leadsto_v$ be the relation of doing 0 or 1 $=_v$ steps without using the symmetry axiom.
When treated as sets, the relations satisfy

\[
\leadsto_v \;\subseteq\; \Rightarrow_p \;\subseteq\; \Rightarrow_v .
\]

Since $\Rightarrow_v$ is the transitive closure of $\leadsto_v$, it is also the transitive closure of $\Rightarrow_p$. ■


Theorem A.7 The relation $\Rightarrow_v$ is Church-Rosser.

Proof: Since $\Rightarrow_p$ is Church-Rosser, its transitive closure $\Rightarrow_p^{*}$ is also [2]. By Lemma A.6,
$\Rightarrow_v$ is Church-Rosser. ■

The most important consequence of the Church-Rosser theorem is

Theorem 2.2 If $M =_v N$, then $M \equiv^{v}_{obs} N$.

Proof: If $M \Rightarrow_v N$, then $C[M] \Rightarrow_v C[N]$. Thus, if either $C[M]$ or $C[N]$ reduces to $c_l$ under
$\Rightarrow_v$, both of them will by Theorem A.7. The theorem then follows by an easy induction on
the number of occurrences of the symmetry rule. ■


A.2 Applicative Congruence

At each step, the relation $\to_v$ reduces only one subterm. We call that subterm the active
subterm [20]. An examination of the operational rules suggests the following definition.

Definition A.8 The active subterm of a non-value, closed term $M$ is

• $M$ if $M$ is of the form $(\mathrm{succ}\ c_l)$, $(\mathrm{pred}\ c_l)$, $(\mathrm{cond}\ c_l\ M_0\ M_1)$, $(\mu f.M_0)$, or $((\lambda x.M_0)\,V)$
for $V$ a value; or

• The active subterm in $M'$, where $M'$ is closed and not a value, if $M$ has the form
$(\mathrm{succ}\ M')$, $(\mathrm{pred}\ M')$, $(\mathrm{cond}\ M'\ M_0\ M_1)$, $(M'\ M_0)$, or $((\lambda x.M_0)\,M')$.
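
The two clauses of the definition translate directly into a recursive function. The sketch below assumes a hypothetical tuple encoding of closed, well-typed terms (`("num", n)`, `("lam", x, body)`, `("mu", f, body)`, `("app", M, N)`, `("succ", M)`, `("pred", M)`, `("cond", B, M0, M1)`) and treats numerals and abstractions as the only closed values; the names `is_value` and `active_subterm` are chosen for this example.

```python
def is_value(t):
    """Closed values in this encoding: numerals and lambda abstractions."""
    return t[0] in ("num", "lam")


def active_subterm(t):
    """Return the subterm that the next call-by-value step would reduce."""
    if is_value(t):
        raise ValueError("a value has no active subterm")
    tag = t[0]
    if tag == "mu":
        # A recursion term unfolds as a whole.
        return t
    if tag in ("succ", "pred", "cond"):
        # Reduce the term itself once its first component is a value,
        # otherwise look inside that component.
        return t if is_value(t[1]) else active_subterm(t[1])
    if tag == "app":
        operator, operand = t[1], t[2]
        if not is_value(operator):
            return active_subterm(operator)
        if not is_value(operand):
            return active_subterm(operand)
        return t  # ((lambda x. M0) V): a beta-redex
    raise ValueError(f"unknown or open term: {t!r}")


identity = ("lam", "x", ("var", "x"))
term = ("app", identity, ("succ", ("pred", ("num", 3))))
found = active_subterm(term)
print(found)
```

For the sample term, the operator is already a value, so the search descends into the operand and stops at the innermost redex.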

This  definition  matches  the  informal  description  of  what  the  active  subterm  should  be: 

Lemma  A.9  Let  M  be  a  closed  subterm  of  a  non- value,  closed  term  C[M\,  where  M 
contains  the  active  subterm  of  C[M]  and  C[*]  has  only  one  hole.  Then  if  M  -*v  M' , 

C[M ]  C[A/']. 

Proof:  An  easy  structural  induction  on  C[-].  ■ 

Lemma  A. 10  Let  M  be  a  closed  subterm  of  a  non-value,  closed  term  C[M\,  where  M 
contains  the  active  subterm  of  C[M]  and  C[-]  has  only  one  hole.  Then  if  M  -»v  M' , 

C[M]  C[M'). 

Proof: By induction on $n$, where

$M = M_0 \to_v M_1 \to_v M_2 \to_v \cdots \to_v M_n = M'$.

The base case, where $n = 0$, is trivial, so we proceed to the induction case. By the induction
hypothesis, $C[M_0] \twoheadrightarrow_v C[M_{n-1}]$. A structural induction on $C[\cdot]$ shows that $M_{n-1}$ contains
the active subterm in $C[M_{n-1}]$; thus, by Lemma A.9, $C[M_{n-1}] \to_v C[M_n]$, so the lemma
holds. ■


Lemma A.11 (Activity) Let $M$ be a closed term of type $o$ and $C[\cdot]$ be a closed context
with holes of type $o$. Then $C[M] \twoheadrightarrow_v c_l$ iff either

1. $C[M'] \twoheadrightarrow_v c_l$ for any $M'$; or

2. $(\lambda x.C[x])\ M \twoheadrightarrow_v c_l$.
Proof: ($\Rightarrow$) If $M$ is a value, condition (2) holds immediately. So suppose $M$ is not a value.
We use a marking technique due to Bard Bloom. Add to the language $\lambda_v$ the term $\#M$ for
any term $M$, and add the reduction rules (and only these rules)

$\#V \to_v V$, $V$ a value

$\dfrac{M \to_v N}{\#M \to_v \#N}$

to the definition of the $\to_v$ relation. Note that these rules do not change the computational
behavior of $\lambda_v$, i.e., $M \twoheadrightarrow_v c_l$ iff $\mathrm{erase}(M) \twoheadrightarrow_v c_l$, where $\mathrm{erase}(M)$ is the result of erasing all
marks in $M$.
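
The erasure function can be sketched in a few lines. This is an illustration only: the tuple encoding of terms and the `"mark"` tag used for $\#M$ are assumptions, not notation from the source.

```python
# Hypothetical encoding (not from the source): terms are nested tuples whose
# first component is a tag, and ("mark", t) stands for the marked term #t.


def erase(term):
    """Return the term with every mark removed, at any depth."""
    if not isinstance(term, tuple):
        # Leaves such as variable names and numeral payloads are unchanged.
        return term
    if term[0] == "mark":
        # Drop the mark and keep erasing inside the marked term.
        return erase(term[1])
    # Rebuild the node with the same tag and erased children.
    return (term[0],) + tuple(erase(child) for child in term[1:])


# (#(lam x. #x)) c_1 erases to (lam x. x) c_1.
marked = ("app", ("mark", ("lam", "x", ("mark", ("var", "x")))), ("num", 1))
print(erase(marked))
```

Because marks only wrap terms and never alter them, erasing a term that contains no marks returns an equal term.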

Proceed by induction on the number $n$ of occurrences of $\#M$ in $C[\#M]$. The base case
($n = 0$) is trivial. In the induction case, suppose that $C[\#M] \twoheadrightarrow_v c_l$. Let $C'$ be the first
term whose active subterm is contained in a subterm of $\#M$; if there is no such $C'$, then
condition (1) holds. Let $C' = D[\#M]$, where $D[\cdot]$ has one hole. Since $\#M \twoheadrightarrow_v V$ for
some unmarked value $V$, using the version of Lemma A.10 for the marked language we
conclude that $C' \twoheadrightarrow_v D[V]$. Note that there is a context $E[\cdot]$ with $n - 1$ holes such that
$D[V] = E[\#M]$. The context $E[\cdot]$ has the property that

$(\lambda x.C[x])\ M \twoheadrightarrow_v (\lambda x.C[x])\ V \to_v C[V] \twoheadrightarrow_v E[V]$.

By the induction hypothesis, either $E[M'] \twoheadrightarrow_v c_l$ for any $M'$ or $(\lambda x.E[x])\ M \twoheadrightarrow_v c_l$. If the
first condition is true, then $(\lambda x.C[x])\ M \twoheadrightarrow_v E[V] \twoheadrightarrow_v c_l$, so $(\lambda x.C[x])\ M \twoheadrightarrow_v c_l$. If the second
condition is true, then $(\lambda x.C[x])\ M \twoheadrightarrow_v E[V] \twoheadrightarrow_v c_l$, since $(\lambda x.E[x])\ M \twoheadrightarrow_v E[V] \twoheadrightarrow_v c_l$.

($\Leftarrow$) Suppose $(\lambda x.C[x])\ (\#M) \twoheadrightarrow_v c_l$. Again, proceed by induction on the number $n$ of
occurrences of $\#M$ in $C[\#M]$. The base case ($n = 0$) is trivial, so consider the induction
case. Examine the reduction sequence for $C[\#M]$, and pick the first $C'$ whose active
subterm is contained in a $\#M$; if there is no such $C'$, then $C[M'] \twoheadrightarrow_v c_l$ for any $M'$, so
$C[M] \twoheadrightarrow_v c_l$. Let $C' = D[\#M]$, where $D[\cdot]$ is a context with one hole and $\#M$ contains the
active subterm in $D[\#M]$. Then

$D[\#M] \twoheadrightarrow_v D[V] = E[\#M]$

where $V$ is a value with $\#M \twoheadrightarrow_v V$ and $E[\cdot]$ is an unmarked context with $n - 1$ holes. Since

$(\lambda x.E[x])\ (\#M) \twoheadrightarrow_v c_l$,

by the induction hypothesis $E[\#M] \twoheadrightarrow_v c_l$. Since $C[\#M] \twoheadrightarrow_v E[\#M]$, $C[\#M] \twoheadrightarrow_v c_l$. ■


Lemma A.12 Let $V_0$ and $V_1$ be closed values of the same type. If $V_0\ V' \preceq_v V_1\ V'$ for any
closed value $V'$, then $V_0 \preceq_v V_1$.

Proof: Again, we use the marking technique. Suppose $C[\#V_0] \twoheadrightarrow_v c_l$, assuming, without
loss of generality, that $C[\cdot]$ contains no marked terms. We proceed by induction on $n$,
where an active subterm of the form $((\#V_0)\ V')$ ($V'$ any closed value) appears $n$ times in
the reduction.
In the base case, $n = 0$; thus, $C[V_1] \twoheadrightarrow_v c_l$ trivially. In the induction case, pick the
first term $C'$ in the reduction sequence with an active subterm of the form $((\#V_0)\ V')$.
Let $C' = D[(\#V_0)\ V']$, where $D[\cdot]$ has one hole and the hole is active. We know that
$D[(\#V_0)\ V'] \twoheadrightarrow_v c_l$. By hypothesis, $D[V_1\ V'] \twoheadrightarrow_v c_l$. Let $E[\cdot]$ be the context where
$D[V_1\ V'] \twoheadrightarrow_v E[\#V_0]$ and $E[\cdot]$ has no occurrences of $\#V_0$. Since $E[\#V_0] \twoheadrightarrow_v c_l$ with $(n - 1)$
reductions of the form $((\#V_0)\ V'')$ for some closed value $V''$, by induction we conclude that
$E[V_1] \twoheadrightarrow_v c_l$. The lemma now follows since $C[V_1] \twoheadrightarrow_v E[V_1] \twoheadrightarrow_v c_l$. ■

Theorem 2.3 Let $M$ and $N$ be closed terms of the same type. Then $M \preceq_v N$ iff, for all
vectors $\vec{V}$ of closed values,

$M\ \vec{V} \twoheadrightarrow_v V_0'$ implies $N\ \vec{V} \twoheadrightarrow_v V_1'$, and $V_0' = V_1'$ if either is a numeral.

Proof: ($\Rightarrow$) Trivial.

($\Leftarrow$) By induction on types. Consider first the base case, where $M$ and $N$ are of type $o$.
Suppose $C[\cdot]$ is a context in which $C[M] \twoheadrightarrow_v c_l$; then we know by the Activity Lemma that
either $C[M'] \twoheadrightarrow_v c_l$ for any $M'$ or $(\lambda x.C[x])\ M \twoheadrightarrow_v c_l$. In the first case, $C[N] \twoheadrightarrow_v c_l$ trivially.
In the second case, since $M$ must reduce to some numeral, say $c_{l'}$, it must be the case that
$N \twoheadrightarrow_v c_{l'}$. Thus, $C[N] \twoheadrightarrow_v c_l$, so $M \preceq_v N$.

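In the induction case, again consider any $C[\cdot]$ where $C[M] \twoheadrightarrow_v c_l$. Then by the Activity
Lemma, either $C[M'] \twoheadrightarrow_v c_l$ for any $M'$ or $(\lambda x.C[x])\ M \twoheadrightarrow_v c_l$. In the first case, $C[N] \twoheadrightarrow_v c_l$
trivially. In the second case, $M \twoheadrightarrow_v V_0$ for some closed value $V_0$. Since for any vector $\vec{V}$ of
closed values,

$M\ \vec{V} \twoheadrightarrow_v V_0'$ implies $N\ \vec{V} \twoheadrightarrow_v V_1'$,

it follows (using the empty vector) that $N \twoheadrightarrow_v V_1$ for some closed value $V_1$. By hypothesis,
for any closed value $V'$,

$(M\ V')\ \vec{V} \twoheadrightarrow_v c_l$ implies $(N\ V')\ \vec{V} \twoheadrightarrow_v c_l$.

By the induction hypothesis, $M\ V' \preceq_v N\ V'$ for any $V'$. By Lemma A.12, since $M \equiv_v^{\mathrm{obs}} V_0$
and $N \equiv_v^{\mathrm{obs}} V_1$, $M \preceq_v N$. ■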